It was a big mistake on my part but I did it: I set a weird root password on my ESXi host, thought I’d remember it, didn’t write it down, didn’t set up SSH key access or vSphere… and lost access to the admin of my server. Bummer. The official documentation says “reinstalling the ESXi host is the only supported way to reset a password”, but I thought it was kinda lame and that they simply didn’t want to officially support shady config file manipulation. Surely there’s a way to just replace the password like you would on a standard Linux box, right? Right?… Well… not quite. Here’s how I had to do it.
First, how is my server set up? At the time it was running ESXi-7.0U3m, hosted at OVHCloud. The host doesn’t have a TPM and already had the SSH service turned on (if yours doesn’t, there’s still a way). I can also force a reboot into a recovery system that can access the data on the disks without booting the installed OS, as if I had physical access to the machine itself (it also supports IPMI, which effectively gives full control as if I were sitting right in front of the machine, but that’s not necessary in this case).
TL;DR: I had to reboot the server in rescue mode, mount the correct bootbank, fetch state.tgz, decrypt local.tgz.ve by installing a temporary ESXi server and replacing its encryption key, then add the SSH keys into the decrypted local.tgz and put it back on the server, so I can SSH back again and change the root password.
According to the most popular tutorial on how to manually reset an ESXi password, you just reboot your server on a live CD (or on the rescue system in my case), mount the bootbank partition (sda5), open state.tgz, then local.tgz, change the etc/shadow file, repack everything back together, replace the state file in the bootbank and you’re good to go. So I went to work: I halted all my VMs, force rebooted the server into rescue, and grabbed the state.tgz file. And that’s when I discovered that, since ESXi 7.0.3, the local.tgz file is encrypted! That was very underwhelming. Its filename is local.tgz.ve and it should be accompanied by an encryption.info file, containing the encryption keys.
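The check that revealed the problem can be sketched in a few lines. This is only a minimal sketch, assuming state.tgz has already been copied out of the mounted bootbank on the rescue system; the function name state_kind is made up for illustration and is not part of any ESXi tooling.

```shell
#!/bin/sh
# Hypothetical helper: report whether a state.tgz holds a plain or an
# encrypted local.tgz. Only standard tar and grep are used.
state_kind() {
    archive="$1"
    # List the archive members without extracting anything.
    members=$(tar tzf "$archive") || return 1
    # Member names may or may not carry a leading "./".
    if printf '%s\n' "$members" | grep -qxE '(\./)?local\.tgz\.ve'; then
        echo "encrypted"
    elif printf '%s\n' "$members" | grep -qxE '(\./)?local\.tgz'; then
        echo "plain"
    else
        echo "unknown"
    fi
}
```

Calling state_kind ./state.tgz on an archive that contains local.tgz.ve prints "encrypted", which is the case described above where the classic etc/shadow edit no longer applies.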
First I thought: well that doesn’t look so bad, surely there’s a tool to decrypt it using the encryption.info data, it’s just a little setback. Well, not so fast, there are two problems: first, you have to use an ESXi 7 system to decrypt such a file (you can only use the proprietary crypto-util command, and it’s not available on Linux), which is quite difficult when you don’t have access to your server or any other server. And second, each server uses its own encryption keys, so you cannot just “use” or “import” the decryption key (at least I didn’t find any easy way, tell me in the comments if you do). Well, there’s still a way to bypass those two problems.
So I needed to gain access to another ESXi server somehow. I could install it on a spare machine, but it’s a mess, I didn’t have the hardware on hand, and I didn’t even know if it would be supported at all… Maybe a VM? I tried with VMware Workstation 17 on Windows 10, but my CPU doesn’t support virtualizing Intel VT-x (nested virtualization). I thought I was out of luck, but I tried it anyway: I lied to VMware by pretending I was installing Windows 10 instead, without nested virtualization support. I didn’t have high hopes, but as it turns out, you can install and run a fully virtualized ESXi system without proper CPU support! The installer warns you about it, but to my surprise it doesn’t prevent you from proceeding with the installation anyway. You won’t be able to run any virtual machine obviously, but that’s not what I needed, so it was fine for me. That’s it for step one.
So I finally had access to that crypto-util command. I knew the command line to use, having found it in some other post: “crypto-util envelope extract --aad ESXConfiguration local.tgz.ve local.tgz”. Will it work? No: it doesn’t read the provided encryption.info file, and there’s no proper way to make it use one. My virtual system has its own encryption.info file with a different keyset, so I get an “ESXi kernel key cache error”. I would have to import the key somehow. Maybe… by replacing the key on the virtual system! It turns out the local.tgz file doesn’t need to be encrypted to be used by the system on boot; the system only tries to decrypt it if the local.tgz.ve file is present (see how the /bin/auto-backup.sh file works). So what I did is: extract /bootbank/state.tgz, decrypt the local.tgz.ve file, then remove it, and replace the encryption.info file with the one from my server. I then rebuilt the state.tgz file with both files (encryption.info and the decrypted local.tgz) and rebooted the virtual system.
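# Work in a scratch directory outside the bootbank
mkdir /tmp/a
cd /tmp/a
# Unpack the virtual system's own state archive
tar xzf /bootbank/state.tgz
# Decrypt its configuration with its current key, then drop the encrypted copy
crypto-util envelope extract --aad ESXConfiguration local.tgz.ve local.tgz
rm local.tgz.ve
# (replace encryption.info here with the copy taken from the locked-out server)
# Repack with the swapped encryption.info and the decrypted local.tgz
tar czf /bootbank/state.tgz encryption.info local.tgz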
To my amazement, the virtual server happily booted and replaced its own encryption key with the same key used by my real server! That means I was finally able to decrypt my server’s local.tgz.ve file, gaining access to all the configuration files. Success!… Well, not yet: on ESXi7, there’s no shadow file anymore, the root password is stored in one of its proprietary config files, and it’s a pain to change it. So instead, I’ve set up SSH key login by placing my public keys into the etc/ssh/keys-root/authorized_keys file. That way, I’d be able to login without a password, and change it using the passwd command. Easy, right?
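Once local.tgz is decrypted, adding the key and rebuilding state.tgz is plain tar work. Here is a rough sketch of that step, under a few assumptions: the work directory already holds the decrypted local.tgz and the server’s own encryption.info, the commands run as root so file ownership survives the round trip, and the archive has no hidden top-level entries. The function name add_root_key and the tree/ scratch directory are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: append a public key to root's authorized_keys inside
# a decrypted local.tgz, then rebuild state.tgz next to it.
# Usage: add_root_key WORKDIR PUBKEY_FILE
add_root_key() {
    workdir="$1"
    pubkey="$2"
    tree="$workdir/tree"

    mkdir -p "$tree" || return 1
    # Unpack the configuration; -p keeps the stored permissions.
    tar xzpf "$workdir/local.tgz" -C "$tree" || return 1

    # Path taken from the article; create it if the archive lacks it.
    mkdir -p "$tree/etc/ssh/keys-root" || return 1
    cat "$pubkey" >> "$tree/etc/ssh/keys-root/authorized_keys" || return 1

    # Repack every (non-hidden) top-level entry under its original name.
    (cd "$tree" && tar czf ../local.tgz.new *) || return 1
    mv "$workdir/local.tgz.new" "$workdir/local.tgz" || return 1

    # state.tgz = original encryption.info + unencrypted local.tgz.
    (cd "$workdir" && tar czf state.tgz encryption.info local.tgz)
}
```

A call such as add_root_key /tmp/a ~/.ssh/id_ed25519.pub (both paths are examples) leaves a fresh /tmp/a/state.tgz ready to be copied back to the server’s bootbank.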
Well, that’s all that was left to do. So I halted all my VMs and force rebooted my server for the second time, grabbed the state.tgz file out of the server, did my little kitchenery to uncompress/decrypt/replace/repack everything, put the resulting state.tgz file back into place, rebooted and… nothing. I still couldn’t log in, as if I had done nothing! And that’s pretty much what I had done, nothing: I forgot about the altbootbank… I changed the wrong bootbank! That’s right, ESXi uses two boot filesystems, in case one doesn’t boot anymore after an update, and it switches between them after each update. sda5 is the first bootbank, and sda6 is the second. I only did sda5. So, after redoing the same process all over again on the correct bootbank, it finally worked: I was able to SSH into the server, set a new password, and save it to a secure place. I also kept my SSH keys in place, just in case. And I will certainly not make the same mistake again.
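To avoid the wrong-bootbank trap, the copy-back step can be written so that it handles whichever bootbanks you point it at. The following is only a sketch, assuming both partitions are already mounted on the rescue system; the function name install_state and the mount points in the usage note are hypothetical. It keeps a copy of each original outside the bootbank and skips any bank that has no state.tgz at all. Whether to update one bank or both is your call.

```shell
#!/bin/sh
# Hypothetical helper: copy a rebuilt state.tgz into each given bootbank
# directory that already has one, backing up the original elsewhere.
# Usage: install_state NEW_STATE BACKUP_DIR MOUNTPOINT...
install_state() {
    new_state="$1"
    backup_dir="$2"
    shift 2
    mkdir -p "$backup_dir" || return 1
    for bank in "$@"; do
        if [ -f "$bank/state.tgz" ]; then
            # Keep the original outside the bootbank in case of rollback.
            cp "$bank/state.tgz" "$backup_dir/state.tgz.$(basename "$bank")" || return 1
            cp "$new_state" "$bank/state.tgz" || return 1
            echo "updated $bank"
        else
            echo "skipped $bank (no state.tgz)"
        fi
    done
}
```

For example, install_state /tmp/a/state.tgz /root/backup /mnt/sda5 /mnt/sda6 would print one "updated" or "skipped" line per bank, so it is obvious which one was actually touched.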
Gold man, this is gold.
I ended up solving using a similar method and found this along the way and figured I’d give a shout out for creativity.
The other thing you could have done was to take the config file from the known host and use it to overwrite the proprietary file on the broken system.
Is there any solution for those cases when the encryption key is not found inside “encryption.info” and is instead stored in the TPM firmware??
Thanks in advance!
Kind regards.
As I mentioned in the article, the presence of a TPM module makes this method rather unusable because the configuration is then encrypted by hardware directly. I guess there’s not much you can do apart from reinstalling. Maybe you could decrypt the config file if you run another install of ESXi on the same server, so that you use the same TPM module, but I would not be very hopeful. Next time, save the password in a safe place, and/or if running the SSH server at all times isn’t a security issue, add an SSH key as a backup.
Thanks for the info, I’ll try to investigate a bit more. Turns out I was wrong, and the “encryption.info” file IS present even when the keys are stored in the TPM firmware! I have just tested it on a recent Dell server.
The “encryption.info” file contains the following:
.encoding = “UTF-8”
includeKeyCache = “FALSE”
mode = “TPM”
And then a bunch of lines that start with “t.” Then a line with “encKey”. And lastly a bunch of lines starting with “p11.”
I suppose that those are public keys, and I suspect the private keys will be hidden inside the TPM.
I guess it’s impossible to crack (unless someone finds an exploit or vulnerability)
Sir, GREAT NEWS! Your method works also with a TPM! The only requisite is that you need physical access to the server (because the private keys are stored inside the TPM)
I’ve just tested it on a Dell PowerEdge R350 with a discrete TPM chip (manufactured by Nuvoton, firmware 7.2.2.0)
First I removed all the original disks from the server, and inserted a blank SSD. Then I installed the exact same version of VMWare ESXi on the SSD and let the installation complete fully. Afterwards I followed your instructions (making sure to replace the “encryption.info” file with the one from the original disks).
Thanks again for your work! I’m super happy with this result!
Hello,
Thanks a lot for this explanation.
When I tried to decrypt the local.tgz.ve file from my old ESXi, I got this error:
crypto-util envelope: ESXi kernel key cache error searching for XXXXXXXXXXX’: Not found.
Do you have any idea what the issue could be?
Thanks a lot.
Hi hub2rock,
As stated in the article (but maybe it’s not very clear), you have to replace the encryption key on your new ESXi system (the one you’re decrypting on). For that, you have to first decrypt the original local.tgz.ve from your new ESXi, then remove the .ve file, and replace the encryption.info file with the one from your old ESXi. Repack state.tgz, reboot, and now your new ESXi server will use the same encryption key as your old server! You’ll then be able to decrypt the old local.tgz.ve file using the same command as you did. For more detailed explanations (but still not very clear, I’m sorry, it’s quite a confusing method), reread the article from the paragraph that starts with “So I finally had access to that crypto-util command.” Hope it’ll help.
Hello Admin,
I’m pretty sure that’s what I’m doing but I’m still getting the same error…
Hmm, OK, on my second try it works. Thanks a lot.
Hi,
I followed this KB article:
https://knowledge.broadcom.com/external/article/324525/modifying-the-rclocal-or-localsh-file-in.html
In my lab, I copied /etc/rc.local.d/local.sh into the extracted local.tgz structure (etc/rc.local.d/)
and modified local.sh:
I added /etc/init.d/SSH start before the “exit 0” line, and it works.
Thanks for your mind map.
Thanks for the tip! I was not very fond of the SSH requirement of this method but apparently it is not a requirement anymore, thanks!
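For anyone wanting to script that tip, here is a small sketch of the edit. It assumes local.sh has already been extracted from the decrypted local.tgz (under etc/rc.local.d/) and that it ends with an “exit 0” line as described in the comment; the function name enable_ssh_at_boot is made up. If there is no “exit 0” line, the file is left unchanged.

```shell
#!/bin/sh
# Hypothetical helper: insert the SSH start command before the first
# "exit 0" line of an extracted local.sh.
# Usage: enable_ssh_at_boot PATH_TO_LOCAL_SH
enable_ssh_at_boot() {
    script="$1"
    scratch="$script.tmp"
    awk '
        /^exit 0/ && !inserted { print "/etc/init.d/SSH start"; inserted = 1 }
        { print }
    ' "$script" > "$scratch" || return 1
    # Write back into the existing file so it keeps its mode bits.
    cat "$scratch" > "$script" && rm -f "$scratch"
}
```

After running it on etc/rc.local.d/local.sh, repack local.tgz and state.tgz as usual before putting the archive back on the server.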
Not sure if this applies to 8.0U2. But if it does I am, after two attempts, just getting:
crypto-util envelope: ESXi kernel key cache error searching for XXXXXXXXXXX’: Not found.
I believe I at least understand the initial steps to take:
Install a second system and unpack the new system’s /bootbank/state.tgz, then remove the unpacked encryption.info and local.tgz.ve files. Insert the encryption.info from the first system and then repack these two files back into /bootbank/state.tgz.
Reboot the new system with the repacked state.tgz and then copy the /dev/sda5/state.tgz from the first system to the new system in order to decrypt it. Which is where it fails for me.
(If I could get past the key cache error) I find it unclear how to proceed with the rest.
I think you mean that the edited files should be repacked into a new local.tgz, which is then in turn repacked with the original encryption.info to finally create a new state.tgz, which will be copied to /dev/sda5/state.tgz AND /dev/sda6/state.tgz?
I did have a look at the local.tgz content from the second system and don’t really know how to edit the files, namely how to add my public key to authorized_keys, or how to make SSH autostart (it was not set to do that).
These are the contents of local.tgz in 8.0U2:
chkconfig.db
dhclient-vmk0.leases
krb5.conf
random-seed
security/
ssh/
vmware/
and ssh contain:
ssh_host_ecdsa_key
ssh_host_ecdsa_key.pub
ssh_host_rsa_key
ssh_host_rsa_key.pub
Hello! I’m trying your method but I have a problem: each time I reboot the VM a new key is set, so the old key never gets loaded into the cache. I fail to see what I’m doing wrong and why a new key is generated on each reboot of the VM. Each time, I pack the old encryption.info file together with the new decrypted local.tgz file into a new state.tgz. I have tried removing state.tgz from /bootbank before creating the new one.
But for some reason the system creates a new state.tgz upon reboot, since the keys inside encryption.info are different each time, even though I repackaged it with the encryption.info from the server that I need to reset the password on. I also noticed there’s a command –keys in crypto-util, but I can’t find any information on whether it would be possible to load the right key directly to decrypt the local.tgz.ve file that way, so I don’t have to reboot. I can’t really find any documentation on crypto-util.
Also, on sda6 there’s only one file, so there’s no state.tgz to be modified there on my system.
I have tried 7.0.3f and 7.0.3n on the vm.
Any ideas?
Do you have a TPM? If yes, you’ll probably have to follow the advice from another comment saying that you have to temporarily remove all storage and install the new system on the same hardware to share the same TPM keys. If not, make sure you’re changing the right bootbank, as there are two of them. And last, make sure there are no leftover .ve files or anything, you should leave everything as is, just put the new decrypted local.tgz and old encryption.info files into state.tgz. Finally, I don’t know if it can be an issue but make sure to do all this outside of the bootbank folders, for example in /tmp, just to keep the linux permissions while recreating the files.
Hi.
I tried but no luck yet!
I created an ESXi 8 virtual machine with VMware Workstation in order to be able to replace stuff in my state.tgz file.
When I reboot the virtual ESXi and extract the ‘/bootbank/state.tgz’ file, it brings back the old encryption.info file…
I checked the permissions. TPM is not the problem here since ‘mode = “NONE”‘ in my original encryption.info file.
I’m slowly starting to lose my mind here. Any idea?
I’ve looked through all files with the local.tgz, but haven’t seen the root password. Could you please tell me which file contains the root password?
Hey admin,
Your manual is great, but I have the same issue Andreas described. I use 7.0.3f, and every time I reboot my machine (also with a hard reset at runtime) it gives me the old state.tgz back with the old encryption.info.
After I change the file and wait a bit, it also gets overwritten by the original encryption.info.
In / I found a .#encryption.info file which is write-protected.
Do you have any ideas to get this to work? Maybe a version conflict? But my recovery system is 7.0.3f.
@Andreas Did you find a solution?
@justin
You have to add the SSH key auth files to the config as described.
“I’ve set up SSH key login by placing my public keys into the etc/ssh/keys-root/authorized_keys file.”
Then log in with SSH key auth and change the root password there.
Hello guys, I’m having the same issue as Olli: every time I replace encryption.info and tar local.tgz to rebuild state.tgz, I reboot. After rebooting and extracting the encryption.info from state.tgz, I get the initial key of the new ESXi, not the one I want to use as a replacement.
The system is version 8.0.3.
Does someone know of a setting that I’m not aware of?
Hi!
Andreas Larsson, Dimitri, Olli, Baptiste, I found a solution!
Before, I connected to a non-primary server via SSH, did “my little kitchenery to uncompress/decrypt/replace/repack” and replaced the state.tgz file. It did not work for me: after the reboot, state.tgz had been replaced with the original encryption.info.
Then I tried doing “my little kitchenery to uncompress/decrypt/replace/repack”, saving the file to USB, and replacing it via a LiveCD (in /dev/sda5). And I was amazed too!!!)))
I was able to decrypt the file from the main server.
BUT: I can’t access the main server because SSH is disabled!(
Terry or anybody, help me please.
I went to: https://knowledge.broadcom.com/external/article/324525/modifying-the-rclocal-or-localsh-file-in.html
And the next link doesn’t work (404 error): Editing configuration files in VMware ESXi and ESX (1017022).
How can I edit etc/rc.local.d/local.sh if I can’t access the server via SSH? From the LiveCD I don’t see the /etc/ directory.
I only see: sda1 (EFI system), sda5 (only files, no directories), sda6 (1 file, boot.cfg), sda7 (VMFS).
How do I edit the local.sh config file without SSH?
How do you “deposit” your SSH keys if you can’t actually log in to the server? A live Linux won’t load the /etc/ssh folder, only the bootbanks.
How do you SSH into a server you don’t have the password for, or replace the SSH keys if you can’t see its filesystem? The live Linux can only load the bootbanks.