Backup services state

This morning I had a couple of app servers just giving me fits. I turned to PowerShell to quickly do a diff on the servers. I started to back up the registry and do a diff there, but decided that I should start somewhere simpler. My solution was to “snap” the state of all the services on one that was working. I then restored those settings on one of the trouble nodes and rebooted. Problem solved.

gwmi win32_service | % { write-output "sc config $($_.name) start= $(($_.startmode).replace('Manual','demand'))" } | out-file restore_service.bat -encoding ascii
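If the snapshot is exported as a plain list of service names and start modes instead, the same translation can be sketched in Python. This is only an illustration of the idea, not what I ran: the function name `restore_lines` and the sample services are made up for the example, and it assumes the start modes use the WMI spellings (Auto, Manual, Disabled, and so on).

```python
# Map WMI StartMode values to the keywords that "sc config <name> start=" accepts.
# Assumption: input uses the WMI spellings reported by Win32_Service.
START_KEYWORDS = {
    "Boot": "boot",
    "System": "system",
    "Auto": "auto",
    "Manual": "demand",   # sc calls a manual start "demand"
    "Disabled": "disabled",
}


def restore_lines(services):
    """Turn (service name, start mode) pairs into sc config commands."""
    lines = []
    for name, start_mode in services:
        keyword = START_KEYWORDS.get(start_mode)
        if keyword is None:
            continue  # unknown start mode: skip rather than guess
        # Quote the name so service names containing spaces survive.
        lines.append(f'sc config "{name}" start= {keyword}')
    return lines


if __name__ == "__main__":
    # Hypothetical sample data standing in for a real snapshot.
    snapshot = [("Spooler", "Auto"), ("BITS", "Manual")]
    print("\n".join(restore_lines(snapshot)))
```

The space after `start=` is deliberate; sc expects the value as a separate token.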

ITIL, U-TIL, we all scream for…Configuration Management?

Ok, so the title is a little misleading. Configuration Management is a part of ITIL; however, I’m not going to talk about ITIL, at least not directly.

As an administrator I’m responsible for multiple systems. Some of these are identical, e.g. Apache servers and MySQL servers; some of them provide unique, standalone services. However, they all have some things in common…sshd configuration, log rotation schedules (logrotate), etc.

It’s a PITA to keep up with all of these servers individually. A global change can take quite a bit of time, especially with our ever increasing number of ESX hosts. So, how do I make my job easier, myself more productive, and next year’s raise larger? Automated configuration management.


Punctuality is Important

Timekeeping is especially important for Active Directory and Kerberos. I encountered an error when I was attempting to ssh into one of my AD-enabled ESX hosts. The SSH error was “Permission Denied”; however, after inspecting the logs (/var/log/messages) I discovered that pam_krb5 was throwing “Clock skew too great” errors.

This was odd to me, as I know every one of the ESX servers has NTP configured. Apparently ntpd died at some point, which caused the clock to begin losing time. Once the time difference between the domain controller and the ESX host exceeded 300 seconds (5 minutes), ESX no longer allowed me to log in using AD credentials.
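The 300 second rule is simple enough to sketch. Here is a minimal Python illustration of the comparison; the function name `skew_too_great` and the sample timestamps are my own inventions for the example, and it is not how pam_krb5 implements the check internally.

```python
from datetime import datetime, timedelta

# The 5 minute tolerance mentioned above.
MAX_SKEW = timedelta(seconds=300)


def skew_too_great(host_time, dc_time, max_skew=MAX_SKEW):
    """Return True when the two clocks differ by more than max_skew.

    The direction of the drift does not matter, only its size.
    """
    return abs(host_time - dc_time) > max_skew


if __name__ == "__main__":
    # Hypothetical timestamps: the host clock has fallen 6 minutes behind.
    dc = datetime(2008, 1, 1, 12, 0, 0)
    host = dc - timedelta(minutes=6)
    print(skew_too_great(host, dc))
```

A host that is 6 minutes slow fails the check just the same as one that is 6 minutes fast.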

The fix was somewhat easy…reset the clock. Since I was able to log in to the console, I did so as root and executed ntpdate name.of.domain.controller, which forced it to sync the clock with the DC. After that was taken care of (which confirmed that it was ntp that broke), I went back to Virtual Infrastructure Client and reset the NTP settings for the host (it’s on the Configuration tab).

Provisioning server for VMs

Andrew and I recently reorganized our VI at work.  One of the key changes was the concept of datacenters organized by function.  Without getting too far into it, one of the functions we identified was a standalone resource pool to deploy all VMs from.  We’re referring to it as our provisioning cluster.  Basically, whenever we get a request for a new VM, the VM is deployed there and then VMotioned to its appropriate resource pool only after everything is verified and documented.

Well, done with the theory.  I started to organize all our templates through VIC, but quickly realized we have a ton of them!  Win2k, Win2k3, Win2k8, RHEL4, Solaris 10… all of which have x86/x64 variants for each of our licensed options… Standard, Enterprise, Datacenter… etc.  Did I mention all I did for a month was build templates?  Anyway, they were everywhere, and I was not looking forward to this.  Then I read this post from Hal’s blog, and quickly realized that this was something worth scripting.  The version I used at work looked like

Get-Template | get-view | % {$_.MarkAsVirtualMachine((get-cluster "pool1" | Get-ResourcePool | get-view).MoRef, (get-VMHost "ESX1.localdomain.local" | get-view).MoRef); $_.MarkAsTemplate()}

but that, kids, is not what I would call production ready.  That’s what I love about PowerShell: I had a one-time task… boom, one line!  It took 15 minutes to find and move all of the templates we had in our environment.  As I was adding this script to our internal wiki, it occurred to me that someone could probably build on this the same way I built on Hal’s post.  So here is a slightly more polished version.


Does anyone know the password for this database?

Those that I work with know that my first, and primary, job is as a MySQL DBA. Unfortunately, because I love MySQL, I haven’t been doing as much with it lately because of all the virtualization work going on.

Today I’m going to post about MySQL. Occasionally you may encounter a MySQL server that has been around for a while, and no one knows who set it up, where it came from, or who owns it. Those wonderfully inaccessible databases are still someone’s responsibility. So, what do you do if you don’t know the root password? Well, it’s actually not all that difficult, assuming you can start and stop the instance a few times.


Convert DN to Canonical and back

I’ve been revamping our user account creation process lately (more on that when I finish it).  I started with the Quest cmdlets, but performance and limitations led me back to ADSI.  Along the way I kept having to go from canonical naming (domain.com/ou/subou/sizemore, glenn) to the more common distinguished name.  After the third time I did this by hand, I decided to script it.  I wrote the following functions to handle the conversion.

Note:  Should these be named ConvertTo-* or ConvertFrom-* ?
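To show the shape of the conversion, here is a rough Python sketch rather than my PowerShell functions. The function names are invented for this example. It handles backslash-escaped commas in the DN, and it makes one big assumption going the other way: a canonical name does not say whether a container is an OU or a CN, so the sketch treats the last element as a CN and everything between it and the domain as an OU.

```python
import re


def split_dn(dn):
    """Split a DN on commas that are not escaped with a backslash."""
    return [part.strip() for part in re.split(r"(?<!\\),", dn) if part.strip()]


def dn_to_canonical(dn):
    """CN=Sizemore\\, Glenn,OU=subou,OU=ou,DC=domain,DC=com -> domain.com/ou/subou/Sizemore, Glenn"""
    domain, path = [], []
    for rdn in split_dn(dn):
        key, _, value = rdn.partition("=")
        value = value.replace("\\,", ",")  # unescape commas inside a value
        if key.strip().upper() == "DC":
            domain.append(value)
        else:
            path.append(value)
    # DNs run leaf-first; canonical names run root-first, so reverse the path.
    return "/".join([".".join(domain)] + path[::-1])


def canonical_to_dn(canonical):
    """Inverse of dn_to_canonical. Assumes the leaf is a CN and all containers are OUs."""
    domain, *path = canonical.split("/")
    rdns = []
    if path:
        *containers, leaf = path
        rdns.append("CN=" + leaf.replace(",", "\\,"))
        rdns += ["OU=" + c.replace(",", "\\,") for c in reversed(containers)]
    rdns += ["DC=" + label for label in domain.split(".")]
    return ",".join(rdns)


if __name__ == "__main__":
    dn = "CN=Sizemore\\, Glenn,OU=subou,OU=ou,DC=domain,DC=com"
    print(dn_to_canonical(dn))
    print(canonical_to_dn("domain.com/ou/subou/Sizemore, Glenn"))
```

Objects that live in a CN container (the default Users container, for example) would need extra handling, since the canonical form alone cannot tell the two container types apart.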

Fedora 8 Suspension

I’ve previously mentioned that I use Fedora 8 on my laptop at home. It is a Core 2 Duo Dell with a GeForce Go 7300. Originally, it had Vista Home Premium, and I really did give Vista a chance (for almost 8 months!!), but I just like Linux more. I do still have to go back to Vista on the (extremely) rare occasion I need bluetooth support. For some reason I can’t get the integrated bluetooth modem to work with Fedora. The GeForce Go has caused me nothing but problems. Nvidia’s normal drivers won’t work with the Go series from Dell; I have to get the drivers directly from Dell…and they are flaky.

Anyway, I recently reloaded my laptop and let it update everything to the newest available. Unfortunately, at some point, suspend stopped working. I’m not sure when it was (it applied ~300 updates), but it stopped. Well, it didn’t exactly stop working…it still suspends, once, after which the monitor refuses to work. I can still ssh in, and everything seems to be functioning normally, but the monitor doesn’t work, which makes a laptop pretty useless.

So, since I’ve reloaded Linux a number of times, and it seems each time I forget what I did to fix it, I’m documenting it for myself, and posterity.
