ESXi 4.0 autoinstall

Being, first and foremost, lazy and getting my paychecks for being a system administrator, I felt that the amount of work involved in loading ESXi 4.0 on my blades was entirely too much. I have well over 100 blades, each one needing to have vSphere loaded onto it, configured, and added to vCenter. Even using the directions scattered across the internet about reducing the amount of effort involved in loading vSphere was too much for me.

Others have documented how to PXE boot ESXi elsewhere on the internet; however, I wasn’t interested in having a “stateless” install…I merely wanted to automate installing ESXi to the local hard drive. My blades have a single hard drive, a single generation one SSD, or two SAS drives in a RAID 1, depending on the vendor, and I simply want the installer to always install to that drive without bothering me. Loading from the “remote media” functionality of the DRAC/iLO for the blades takes forever, so I wanted to be able to boot the installer using PXE and push the media over the network instead.

So, having been a developer for several years, I decided to dive further into the install process than others had detailed. It turns out that eliminating all input from an administrator to load the operating system was pretty simple.

The end result is that I am able to power on a blade, hit F12 to have it PXE boot and walk away. Some time later, we can use PowerShell and the PowerCLI to find the hosts (they will be somewhere in the DHCP scope of the provisioning LAN), give them a permanent IP and hostname, then configure them and add them to vCenter. By using PXE and the interactionless (yes, I did make up that word) install, I cut the time to load ESXi from about 45 minutes (using the remote media function takes FOREVER!) to less than 10.

The entire install process is controlled by a single module appended to the comboot process…a module aptly named “install.tgz”. I have done all of my work from a Linux host (one which is also my PXE host), so these instructions reflect that. I’m quite sure you can do this from Windows, but I don’t know how.

The first thing we need to do is set up our working environment and copy some files around. Note that I’m copying the contents of the ESXi install CD to a location inside the tftp server’s document tree (for me, the “root” location for the tftp server is /tftpboot).
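A minimal sketch of the extraction step in Python, assuming install.tgz has already been copied from the CD to somewhere under the tftp root. The function name and the example paths in the comment are illustrative assumptions, not something the installer requires:

```python
import os
import tarfile


def extract_install_module(module_path, work_dir):
    """Unpack the installer module (a gzip-compressed tar) into work_dir."""
    os.makedirs(work_dir, exist_ok=True)
    with tarfile.open(module_path, "r:gz") as archive:
        archive.extractall(work_dir)
    # Return the top-level entries so the caller can see what was unpacked.
    return sorted(os.listdir(work_dir))


# Hypothetical usage; adjust both paths to your own layout:
#   extract_install_module("/tftpboot/images/esxi4/install.tgz", "/tmp/vsphere")
```

Unpacking into a scratch directory such as /tmp/vsphere keeps the pristine copy under the tftp root untouched until the modified module is ready to replace it.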

Now that we have the contents of the install module extracted somewhere we can work with, it’s time to do some editing. Using the above “working” directory (/tmp/vsphere), edit /tmp/vsphere/usr/lib/vmware/installer/ThinESXInstall.py, changing lines 22-23 to look like the following:

This is exactly the same as has been documented in many other places showing how to abbreviate the install process with as few inputs as possible from the administrator. To take it a step further and eliminate input, you will also need to edit the file /tmp/vsphere/usr/lib/vmware/installer/ThinESX/ThinESXInstallSteps.py, changing the “TargetSelectionStep” method (approx line 56) so that it looks like the following:

The change that occurs is the final if/else statement. The original simply has the line beginning with “return LaunchDialog ...“. What we have accomplished with the change is checking to see how many install targets were found; if there is only one, and it’s a local disk, just install to that location. If there is more than one target, if the target is not local, or anything else doesn’t add up then it will present the normal device selection dialog. This has the advantage of not auto installing if the server has both a local disk and a fibre channel HBA that reports LUNs as “local” storage (I’m not sure how common this is as it doesn’t happen with my HP blades), because there would be more than one target in the array, thus causing the administrator to be prompted.
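The decision described above can be sketched in isolation. This is not the installer's code: `choose_target`, `is_local`, `ask_user`, and `NoValidDevicesError` are hypothetical stand-ins for the installer's own target objects, locality check, selection dialog, and exception.

```python
class NoValidDevicesError(Exception):
    """Stand-in for the installer's 'no valid devices' exception."""


def choose_target(targets, is_local, ask_user):
    """Pick the install target automatically only in the unambiguous case."""
    if len(targets) == 0:
        raise NoValidDevicesError()
    # Exactly one candidate, and it is a local disk: install without asking.
    if len(targets) == 1 and is_local(targets[0]):
        return targets[0]
    # Several candidates, or a single non-local one: let a human decide.
    return ask_user(targets)
```

Falling back to the dialog for every case other than "exactly one local disk" is what keeps a host with extra LUNs from being installed to the wrong device.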

The final steps are to recreate the install module for the boot process and to add the lines to the PXE configuration file to allow ESXi to be booted/installed via the network.
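A minimal sketch of recreating the module in Python, assuming the working directory still holds the sbin and usr directories that came out of the original install.tgz. The function name and the output path are assumptions to adapt to your own tftp layout:

```python
import os
import tarfile


def build_install_module(work_dir, output_path, members=("sbin", "usr")):
    """Pack the listed top-level directories from work_dir into a .tgz."""
    with tarfile.open(output_path, "w:gz") as archive:
        for name in members:
            source = os.path.join(work_dir, name)
            if not os.path.isdir(source):
                raise FileNotFoundError(f"missing directory: {source}")
            # arcname keeps paths relative, so entries start with 'sbin' or 'usr'.
            archive.add(source, arcname=name)
    return output_path


# Hypothetical usage; adjust both paths to your own layout:
#   build_install_module("/tmp/vsphere", "/tftpboot/images/esxi4/install.tgz")
```

Both directories go back into the archive even though only files under usr were edited, because the installer expects the complete module.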

Make sure that you set the paths noted here to the paths that are correct for you and your PXE server’s configuration. In my environment, I have /tftpboot as the tftp server’s root. I store the boot images in subdirectories of /tftpboot/images and the pxelinux configuration file at /tftpboot/pxelinux.cfg/default.
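As a rough sketch of what a pxelinux entry for this could look like, the helper below renders one stanza from a label, an image directory relative to the tftp root, and a list of boot modules. It assumes the modules are chain-loaded through Syslinux's mboot.c32 with " --- " between them. The module list is deliberately left as a parameter: take it from the boot configuration on your install CD rather than from this example, and treat `pxe_entry` as a hypothetical helper.

```python
def pxe_entry(label, image_dir, modules):
    """Render one pxelinux stanza that loads the modules through mboot.c32."""
    paths = [f"{image_dir}/{name}" for name in modules]
    return "\n".join([
        f"label {label}",
        f"  kernel {image_dir}/mboot.c32",
        f"  append {' --- '.join(paths)}",
    ])


# Hypothetical usage: two labels that differ only in the final module,
# one pointing at the modified install.tgz and one at VMware's original.
#   pxe_entry("esxi4", "images/esxi4", [..., "install.tgz"])
#   pxe_entry("esxi4_default", "images/esxi4_default", [..., "install.tgz"])
```

Keeping a second label that boots the unmodified module gives you a way back to the interactive installer without touching the server's boot media.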

Now that everything is configured, boot the server and specify PXE during the POST process (F12 for IBM, HP and Sun blades). Once the server is at the PXE boot prompt, I can specify esxi4 or esxi4_default depending on whether I want to just let ESXi install itself or use the default install module provided by VMware.

So, all I have left is to cajole Glenn into posting how to configure the host using POSH and PowerCLI : )

19 thoughts on “ESXi 4.0 autoinstall”

  1. Nice work.. question though, when recreating the install module, the tar czvf command includes the usr dir where files were modified and a sbin directory that was not referenced earlier. Why?

  2. The install module contains two directories: sbin and usr. I didn’t have to modify anything in the sbin directory, so I had no need to reference it before then.

    However, it is still necessary for the install process to work correctly, so it must be included when recreating the module.

    Thanks for reading!

    Andrew

  3. Great job Andrew. We are trying to do the reverse. For our Boot From SAN installs, we would like the installer to automatically select and install to our 40GB LUN. This is the standard size on all of our BFS LUNs. Since there could be other LUNs that would show up as well and the LUN ID could change, is there a way to have the installer look for the LUN size of “40G” and make that the default LUN and perform the install automatically? Thanks!

    if len(targets) == 0:
        raise NoValidDevicesException()

    #maybe something like, is there a way to get the targe size?

    if len(targets) >= 1 and targets[size] == "40G":
        data['Target'] = targets[0]
        return data
    else:
        return LaunchDialog(DeviceSelectionDialog(targets, data))

  4. sorry for the typos…I meant to say:

    if len(targets) == 0:
        raise NoValidDevicesException()

    #is there a way to get the target size?

    if len(targets) >= 1 and targets[size] == "40G":
        data['Target'] = targets[size]
        return data
    else:
        return LaunchDialog(DeviceSelectionDialog(targets, data))

    • Huy,

      Thanks for reading the post! There is a huge amount of information about the system available during the install process, but very little of it is actually needed. I was able to cheat somewhat as VMware’s developers already use the methods from the underlying C(++?) that actually talks to the hardware to present the information in the dialogs during install.

      I believe that this will work for you. I’m not at home right now, so I have no access to my test/dev environment and am unable to actually execute the modifications, so please keep that in mind when you use this. If you do get any errors, please let me know!

      The following modifications would need to be made to the TargetSelectionStep, with the final version looking like this:

      def TargetSelectionStep(data):
         """TargetSelectionStep
         This install step is responsible for presenting the user with the device
         selection dialog and determining the target which is being installed to."""
         targets = TargetEnumeration(NotPredicate(RACVirtualMediaFilter))
      
         if len(targets) == 0:
            raise NoValidDevicesException()
      
         # get some helper methods from another file
         from Dialogs.DeviceSelectionDialog import HumanReadableSize, GetDiskTypeFromLun
      
         for target in targets:
            # look for a Fibre Channel LUN that is 40GB in size
            if HumanReadableSize(target.GetSize()) == "40GB" and GetDiskTypeFromLun(target) == "FC":
               # yay, we found one...now install to it
               data['Target'] = target
               return data
         else:
            # oh noes, no 40GB fibre channel LUN was found, I should ask a human what to do
            return LaunchDialog(DeviceSelectionDialog(targets, data))
  5. Andrew,

    Thank you so much for responding and for providing those steps above. For some reason, it could not find my 40 GB SAN LUN until I made the following changes. Not sure why the == “40G”, “40.0G”, “40.0GB” or “40.0 GB” did not work. I ended up using <= "40G" and it works just fine. Thanks again.

    def TargetSelectionStep(data):
       """TargetSelectionStep
       This install step is responsible for presenting the user with the device
       selection dialog and determining the target which is being installed to."""
       targets = TargetEnumeration(NotPredicate(RACVirtualMediaFilter))
    
       if len(targets) == 0:
          raise NoValidDevicesException()
    
       # get some helper methods from another file
       from Dialogs.DeviceSelectionDialog import HumanReadableSize, GetDiskTypeFromLun, TargetList
    
       for target in targets:
          diskType = GetDiskTypeFromLun(target)
          size = HumanReadableSize(target.GetSize())
    
          # look for a Fibre Channel LUN that is 40GB in size
          if diskType == "fc":
                 if size <= "40G":
                       # yay, we found one...now install to it
                       data['Target'] = target
                       return data
                       break
       else:
          # oh noes, no 40GB fibre channel LUN was found, I should ask a human what to do
          return LaunchDialog(DeviceSelectionDialog(targets, data))
    • Huy,

      It may be more reliable to use the raw, numeric disk size for the comparison rather than the formatted one ( “size <= 40G“). Both sides of your comparison are strings, so Python compares them character by character rather than numerically, which can match sizes you did not intend. Perhaps something like this:

      for target in targets:
            diskType = GetDiskTypeFromLun(target)
            size = target.GetSize()
            # size above is in bytes, we want to install to a LUN of <= 40 GB, so we need that number of bytes
            installsize = 40 * 1024 * 1024 * 1024

            # look for a Fibre Channel LUN that is 40GB or smaller
            if diskType == "fc":
                   # compare two integers for size...
                   if size <= installsize:
                         # yay, we found one...now install to it
                         data['Target'] = target
                         return data
      

      Hope this helps. Happy to hear that it's working for you!

      Andrew

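A small standalone sketch of why the two comparisons behave differently. It assumes, as the reply above does, that the size is available as a byte count; `small_enough` and `GIB` are hypothetical names used only for this illustration.

```python
GIB = 1024 ** 3  # bytes in one binary gigabyte


def small_enough(size_bytes, limit_gib=40):
    """True when a size given in bytes is at most limit_gib gigabytes."""
    return size_bytes <= limit_gib * GIB


# String comparison works character by character, not by magnitude:
print("100G" <= "40G")   # True, because '1' sorts before '4'
print("5G" <= "40G")     # False, because '5' sorts after '4'

# Integer comparison on byte counts behaves as intended:
print(small_enough(40 * GIB))    # True
print(small_enough(100 * GIB))   # False
```

With strings, a 100 GB LUN would pass a "<= 40G" check while a 5 GB one would fail it, which is why comparing the numeric byte counts is the safer test.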
  6. Great work! This is exactly what we needed.

    Along the lines of automating the install, we want to use PowerShell to add the host to vCenter and apply previously defined host profiles. We chose PowerShell PowerCLI over RCLI for this task. However, I have not figured out how to get past these two issues.

    1) Pass IP and subnet mask info via the $additionalConfiguration variable (see example 5 at https://upgrade.vmware.com/support/developer/windowstoolkit/wintk40u1/html/Apply-VMHostProfile.html). With the use of vNetwork Distributed Switches, we need to pass the IP and subnet mask (used for the VMkernel) when applying the host profile.

    some of the code:

    #######
    $profile = Get-VMHostProfile -Name ESXi4-sdwvsh101
    $applyhost = Get-VMhost sdwvsh100.albertsons.com
    Apply-VMHostProfile -Entity $applyHost -Profile $profile -Confirm:$false
    Test-VMHostProfileCompliance -vmhost $applyhost
    $additionalConfiguration = Apply-VMHostProfile -ApplyOnly -Profile $profile -Entity sdwvsh100.albertsons.com

    $profile ##Returns:
    ServerId : @sdwpaps215@443
    Description : Testing CIM provider
    ReferenceHostId : HostSystem-host-23397
    Id : HostProfile-hostprofile-27
    Name : ESXi4-sdwvsh101

    $additionalConfiguration | select Name | format-list ## Returns:

    Name
    ----
    Name : network.dvsHostNic["key-vim-profile-host-DvsHostVnicProfile-dvSwitch_WP-dvPG-NFS-VMotion-vmotion"].ipConfig.IpAddressPolicy.subnetmask

    Name : network.dvsHostNic["key-vim-profile-host-DvsHostVnicProfile-dvSwitch_WP-dvPG-NFS-VMotion-vmotion"].ipConfig.IpAddressPolicy.address

    When I run the command below, I get the error that follows it. 10.52.8.11 is the VMkernel address I use for NFS and VMotion, which is configured on the vNetwork Distributed Switch.

    $additionalConfiguration['network.dvHostNic["key-vim-profile-host-DvsHostVnicProfile-dvSwitch_WP-dvPG-NFS-VMotion-vmotion"].ipConfig.IpAddressPolicy.address'] = '10.52.8.11'

    $additionalConfiguration['network.dvHostNic["key-vim-profile-host-DvsHostV
    FS-VMotion-vmotion"].ipConfig.IpAddressPolicy.address'] = "10.52.8.11"
    Array assignment to [network.dvHostNic["k …] failed: Cannot convert value "network.dvHostNic["key-vim-profile-host-DvsH
    vPG-NFS-VMotion-vmotion"].ipConfig.IpAddressPolicy.address" to type "System.Int32". Error: "Input string was not in a cor
    At line:1 char:26
    + $additionalConfiguration[ <<<< 'network.dvHostNic["key-vim-profile-host-DvsHostVnicProfile-dvSwitch_WP-dvPG-NFS-VMotion
    sPolicy.address'] = "10.52.8.11"
    + CategoryInfo : InvalidOperation: (10.52.8.11:String) [], RuntimeException
    + FullyQualifiedErrorId : ArrayAssignmentFailed

    #######

    2) Change the IP address of an ESXi4 host after it's been installed.

    Appreciate any guidance or help figuring these out.

    Thanks,
    Tim

  7. I’m working on a similar script to select the proper local disk on our IBM blades. Is there any easy way to test this without rebooting and restarting the install? I’m assuming there are some errors being spit out, but I’m not able to see them before the selection dialog is launched.

    • @Tim,

      Glenn has responded far better than I can to your question in a post here. I hope that answers your question.

      @Tad,

      I felt it was easier to make my answer into its own post where I can elaborate more. You can find the post here. I hope it answers your question, and you are more than welcome to ask any questions you might have.

      Thanks for reading!

  8. @Huy,

    Try the following. The size is returned in the format “%5.1f” (3 digits + decimal + 1 digit) so you either need to strip the blanks from the front of the string, or add the blanks in the string you are comparing it to.

    size = HumanReadableSize(target.GetSize()).strip()
    if size == "40.0 GB":

    or

    size = HumanReadableSize(target.GetSize())
    if size == " 40.0 GB":

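The padding behaviour of that format can be checked on its own. This sketch only demonstrates what a "%5.1f" pattern does to a number; how HumanReadableSize builds the rest of its string (the unit suffix, for instance) is not shown here, and `format_size` is a hypothetical helper.

```python
def format_size(value):
    """Format a number the way a "%5.1f" pattern does: width 5, one decimal."""
    return "%5.1f" % value


padded = format_size(40.0)
print(repr(padded))          # ' 40.0' (one leading blank pads it to width 5)
print(repr(padded.strip()))  # '40.0'
print(repr(format_size(100.0)))  # '100.0' (already five characters, no padding)
```

Stripping the result before comparing avoids having to count blanks for each possible number of digits.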
  9. Hi there,
    I’m trying to install ESXi on a remote server, outside my local network, to which I don’t have physical access, but so far I haven’t been able to.

    I successfully installed CentOS remotely by adding this to grub.conf

    default 0
    timeout 30
    title Centos Install (PXE)
    root (hd0,0)
    kernel /vmlinuz.cent.pxe vnc vncpassword=mypass headless ip=173.227.112.84 netmask=255.255.255.192 gateway=173.227.112.65 dns=213.186.33.99 ksdevice=eth0 method=http://ftp.hosteurope.de/mirror/centos.org/5/os/x86_64/ lang=en_US keymap=us
    initrd /initrd.img.cent.pxe

    After booting, vncserver starts and I can connect to the server and install CentOS.

    Is it possible to install ESXi in the same way?
    If so, how?
    Thanks

  10. Hey Andrew,

    awesome post.. thanks a lot for sharing the information..

    one question I had was, is there a way I can enable SSH by default during the install ??

    Thanks a lot
    Appreciate your help…!! 🙂

    • Braga,

      EDIT: So, I seem to have confused myself and thought your question was asked in regard to the kickstart process for ESXi 4.1. I’ll do a little research and get you the answer for ESXi4 as soon as I can.

      Andrew


      As part of the %firstboot section of the kickstart file you can include the following command:

      vim-cmd hostsvc/enable_remote_tsm

      So your final kickstart may look something like:

      vmaccepteula
      rootpw super_secret
      autopart --firstdisk --overwritevmfs
      install url http://a.b.c.d/ESXi/4.1/
      network --bootproto=dhcp --device=vmnic0
      reboot
      
      %firstboot --unsupported --interpreter=busybox
      vim-cmd hostsvc/enable_remote_tsm

      Hope this helps!

  11. Andrew,

    Your last reply really helped me Thanks a Lot..

    One more question I had was about busybox in ESXi 4.1.
    Is there a way I can upgrade Busybox with more supported commands ???

    I need to use tftp client on the ESXi 4.1

    Thanks
    Braga

    • Braga,

      I don’t know of a way to add additional functionality to busybox. It is very minimal, so there is no package management or compiler. You may be able to check the main busybox project to see if they have add in packages.

      That being said, it is probably a bad idea to add software. Not only does it complicate your load process (you now have to decompose each ESXi release and add software), but it most likely puts your ESXi in an unsupported state by VMware, and could potentially add in some undesired after effects (changing libraries, etc. could cause ESXi’s primary functionality to fail).

      Andrew

