NetApp’s SnapManager for Virtual Infrastructure (SMVI) is a great product, but it’s messy. If it encounters any error, it seemingly forgets to delete the virtual machine snapshots from the Virtual Infrastructure before dying.
To keep orphaned snapshots from piling up (I’ve seen as many as 20 on a single virtual machine), I created a quick Nagios check that simply alerts when it sees them.
This script is very elementary. It uses a regex to check for any snapshots whose name matches the default SMVI naming convention (the string "smvi"). For each one it finds, a counter is incremented. If any are found, the script returns a WARNING state to Nagios, which causes an alert to be sent.
    #!/usr/bin/perl -w
    #
    # check_vi_smvi_snapshots.pl - written by Andrew Sullivan, 2010-06-16
    #
    # Please report bugs and request improvements at http://get-admin.com/blog/?p=1059
    #
    # A simple script to look for snapshots that match the name pattern that smvi uses.
    # We are merely pulling a list of all snapshots, searching for the string "smvi" in
    # the name, if it's found, we return a warning condition.  This could lead to a
    # "false" positive if it runs while a snapshot series is still ongoing, but since
    # the smvi snaps should be very short lived the condition will not last unless
    # the snap is left.
    #
    # Example:
    #   ./check_vi_smvi_snapshots.pl --server your.esx.host --username you --password secret
    #

    use strict;
    use warnings;

    use FindBin;
    use lib "$FindBin::Bin/../";

    use VMware::VIRuntime;

    # substitute the location of your nagios perl library
    use lib "/usr/lib64/nagios/plugins";
    use utils qw(%ERRORS);

    Opts::parse();
    Opts::validate();
    Util::connect();

    main();

    Util::disconnect();

    sub main {
        # the number of smvi snapshots
        my $smviSnaps = 0;

        # for setting the type of exit we want
        my $exitCondition = "";

        # we need MORs for each of the VMs on the host
        my $VMs = Vim::find_entity_views( view_type => 'VirtualMachine' );

        foreach my $vm (@$VMs) {
            if ($vm->snapshot) {
                # $vm->snapshot is already a VirtualMachineSnapshotInfo, so
                # rootSnapshotList is read directly from it
                foreach my $childSnapshot (@{$vm->snapshot->rootSnapshotList}) {
                    $smviSnaps += getSnaps($childSnapshot);
                }
            } else {
                #print $vm->name . " has no snapshots\n";
            }
        }

        if ($smviSnaps > 0) {
            print "WARNING - " . $smviSnaps . " SMVI snapshots exist.\n";
            $exitCondition = "WARNING";
        } else {
            print "OK - No SMVI snapshots exist.\n";
            $exitCondition = "OK";
        }

        Util::disconnect();
        exit $ERRORS{ $exitCondition };
    }

    # recursively count the snapshots in a tree whose name contains "smvi"
    sub getSnaps {
        my ($snapshotTree) = @_;
        my $snapcount = 0;

        # uncomment for debugging
        #print "Found snap: " . $snapshotTree->{name} . "\n";

        if ( $snapshotTree->{name} =~ /smvi/ ) {
            $snapcount++;
        }

        if ($snapshotTree->childSnapshotList) {
            foreach my $childSnapshot (@{$snapshotTree->childSnapshotList}) {
                $snapcount += getSnaps($childSnapshot);
            }
        }

        return $snapcount;
    }
I’ve set the check to execute once an hour in my environment, as I don’t feel that finer granularity is needed. An hour’s worth of change is OK for an SMVI snapshot for me.
Hi,
I’m in the phase of testing this script but I always receive this error:
Undefined subroutine &VirtualMachineSnapshotInfo::snapshotInfo called at ./check_smvi_snapshots.pl line 51
I have installed VMware-vSphere-Perl-SDK-4.1.0-254719.i386
Do you have an idea what’s the problem here?
Thanks in advance
Hi again,
I was able to find out why the script failed:
I changed line 51 from:
foreach my $childSnapshot (@{$vm->snapshot->snapshotInfo->rootSnapshotList}) {
to:
foreach my $childSnapshot (@{$vm->snapshot->rootSnapshotList}) {
Now it works fine…
Johann
thanx both
works perfectly
im a bit confused. how do legit smvi snaps not generate false positives on snaps > 0?
what exactly in the snap list is telling of an orphaned by smvi snapshot?
@Nick,
SMVI runs once a day in my environment. Let’s say that time is at 0200 in the morning. So, at 0800, any SMVI snapshot that is still present is there erroneously.
If you run SMVI more frequently than that, you will have to add some logic to check for the age of the snapshot and compare that to an acceptable length of time for it to exist.
Hope that helps,
Andrew
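The age check Andrew describes could be sketched roughly as follows. This is only an illustration, not part of the original script: it assumes each snapshot tree node exposes a `createTime` value as an ISO 8601 UTC string, and the names `$MAX_AGE_SECONDS`, `iso8601_to_epoch` and `count_stale_snaps` are hypothetical. The sample tree is made up so the snippet runs without the VMware SDK.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical threshold: SMVI snapshots older than this count as orphaned.
my $MAX_AGE_SECONDS = 60 * 60;

# Convert an ISO 8601 timestamp such as "2010-06-16T02:00:03Z" to epoch seconds.
# Assumption: the timestamp is in UTC; fractional seconds are ignored.
sub iso8601_to_epoch {
    my ($stamp) = @_;
    my ($y, $mo, $d, $h, $mi, $s) =
        $stamp =~ /^(\d{4})-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)/
        or die "Unrecognized timestamp: $stamp\n";
    return timegm($s, $mi, $h, $d, $mo - 1, $y);
}

# Recursively count snapshots whose name contains "smvi" and whose age at
# time $now exceeds $max_age seconds.
sub count_stale_snaps {
    my ($tree, $now, $max_age) = @_;
    my $count = 0;

    if ($tree->{name} =~ /smvi/
        && $now - iso8601_to_epoch($tree->{createTime}) > $max_age) {
        $count++;
    }

    foreach my $child (@{ $tree->{childSnapshotList} || [] }) {
        $count += count_stale_snaps($child, $now, $max_age);
    }

    return $count;
}

# Made-up sample data standing in for one entry of a VM's rootSnapshotList.
my $tree = {
    name              => 'smvi_old',
    createTime        => '2010-06-16T02:00:00Z',
    childSnapshotList => [
        { name => 'smvi_fresh',    createTime => '2010-06-16T07:45:00Z' },
        { name => 'manual-backup', createTime => '2010-06-15T00:00:00Z' },
    ],
};

my $now = iso8601_to_epoch('2010-06-16T08:00:00Z');
print count_stale_snaps($tree, $now, $MAX_AGE_SECONDS), "\n";
```

In the real check, `$now` would come from `time()`, and `count_stale_snaps` would take the place of `getSnaps` so that only snapshots past the threshold raise the warning.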