Update 2011-07-10: Due to a template export error with Cacti, the import was failing for a lot of people. I apologize for taking so long to fix the templates; they should be fixed now. Thank you to everyone who pointed out the errors and the fix in the comments.
I have made no secret that I use two applications daily to monitor my infrastructure: Nagios and Cacti. I have created a fair number of scripts (and hope to publish more soon) to help Nagios monitor the different parts of the infrastructure, but I haven’t published many of my Cacti scripts.
One of the most useful is the config that I use to monitor the different protocol stats for volumes. I created an indexed query so that a single script, with its accompanying XML file, can monitor all the volumes, and I can select which graphs to create for each volume. The polling script is loosely based on the multi-protocol realtime volume statistics script that I created some time ago.
Included in the templates are graphs for FCP and SAN operations; however, I have none of those on my filer, so I have no graphs to show you.
These are especially useful for volumes that have multiple types of access happening. For example, one of our systems, which provides home directories to some users, has both NFS and CIFS access enabled. It is extremely helpful to see latency for each of the protocols, as it can help diagnose certain errors. For example, our NIS domain had an error at one point that was causing authentication/authorization to be extremely slow. By monitoring the NFS latency, we helped narrow down the problem.
Because it is an indexed script query, you can choose which volumes get each type of graph. This makes it easy to pick the volumes for which you want to see, for instance, NFS latency, using the standard list of objects that Cacti provides.
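For anyone who hasn't written an indexed script query before, here is a minimal sketch of the calling convention Cacti uses for them. It is not my polling script: the volume names, the latency numbers, the field names and the "!" delimiter are all made up for illustration, and the real script talks to the NetApp instead of reading a hard-coded list. Cacti calls the script in three modes: one to list the indexes (here, the volumes), one to list a field for every index, and one to fetch a single value for a single index.

```shell
#!/bin/sh
# Illustrative stand-in for a script query backend. The real script polls the
# NetApp; here the "filer" is two hard-coded, made-up volumes.
# Columns: volume name, NFS latency (invented numbers).
volume_data() {
    printf '%s\n' 'vol0 120' 'vol_home 340'
}

# Usage: script_query index | query <field> | get <field> <volume>
script_query() {
    case "$1" in
        index)
            # One index (volume name) per line.
            volume_data | awk '{ print $1 }'
            ;;
        query)
            # One "index<delimiter>value" pair per line. The "!" delimiter is
            # an assumption; it must match whatever the XML file declares.
            case "$2" in
                name)        volume_data | awk '{ print $1 "!" $1 }' ;;
                nfs_latency) volume_data | awk '{ print $1 "!" $2 }' ;;
                *)           return 1 ;;
            esac
            ;;
        get)
            # A single value for a single index.
            [ "$2" = "nfs_latency" ] || return 1
            volume_data | awk -v vol="$3" '$1 == vol { print $2 }'
            ;;
        *)
            echo "usage: script_query index | query <field> | get <field> <volume>" >&2
            return 1
            ;;
    esac
}
```

With this sample data, `script_query index` lists the two volumes and `script_query get nfs_latency vol_home` prints 340. The list that the index mode returns is what Cacti shows you when you pick volumes to graph.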
The setup is fairly simple. You’ll need the XML file that describes the inputs and outputs that Cacti exchanges with the script and the perl script itself, and you’ll need to import the graph templates. After placing the perl script in your $cacti_path/scripts directory, edit it and make sure that the NetApp SDK files are available. I usually put them in perl’s main library path, but if you keep them in the directory with your script(s), just make sure that a use lib "/path/to/NetApp/sdk" line is at the top of the script.
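As a rough sketch of that step, the helpers below copy a polling script into the scripts directory and check whether the SDK's perl modules are in a given directory. The function names are hypothetical, the 755 mode is my assumption rather than a Cacti requirement, and NaServer.pm and NaElement.pm are the module file names as I recall them from the NetApp SDK, so verify them against your copy.

```shell
#!/bin/sh
# Copy a polling script into <cacti_path>/scripts and make it executable.
# Usage: install_poller <script_file> <cacti_path>
install_poller() {
    src=$1
    dest_dir=$2/scripts
    if [ ! -d "$dest_dir" ]; then
        echo "missing directory: $dest_dir" >&2
        return 1
    fi
    # Mode 755 is an assumption; tighten it if your setup allows.
    cp "$src" "$dest_dir/" && chmod 755 "$dest_dir/$(basename "$src")"
}

# Succeed only if the SDK perl modules are present in the given directory.
# Module file names are assumed; check them against your SDK download.
# Usage: sdk_present <sdk_lib_dir>
sdk_present() {
    [ -f "$1/NaServer.pm" ] && [ -f "$1/NaElement.pm" ]
}
```

If `sdk_present` fails for the directory you pass it, either copy the SDK files there or point the use lib line in the script at wherever they actually live.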
After placing the na-cacti-volume-stats.xml file in the $path_cacti/resource/script_queries directory, the only other modification you should need to make is to put the username and password that will be used to connect to the NetApp(s) in the XML file. I don’t particularly like this: for one, it’s a security risk, and for another, it’s very static. You are welcome to modify the perl script so that it handles authentication a different way, but due to how Cacti behaves and the information that it passes (or doesn’t pass, as the case may be), the only way I have found to provide credentials is outside of the Cacti interface.
Anyway, that irritation aside, make sure that the permissions on the file are tightened so that only the Cacti user can access it, which helps to mitigate the security risk. The second part of mitigation is to ensure that the user that connects to and polls the NetApp has limited access. You should not (SHOULD NOT!) be using root, or any other user in the Administrators group, to connect. The user you connect with doesn’t need to modify anything, so it shouldn’t have any role-based access that allows modification.
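One way to tighten the file permissions is sketched below. The function name is hypothetical, and the account name in the usage line ("cacti") is an assumption, so substitute whichever user your poller runs as.

```shell
#!/bin/sh
# Restrict a file so that only its owner can read or write it.
# Usage: lock_down <file> [owner]
lock_down() {
    file=$1
    owner=${2:-}
    # Changing ownership normally requires root, so it is skipped
    # when no owner is given.
    if [ -n "$owner" ]; then
        chown "$owner" "$file" || return 1
    fi
    chmod 600 "$file"
}
```

Run as root, something like `lock_down "$path_cacti/resource/script_queries/na-cacti-volume-stats.xml" cacti` would do it. If your web server runs as a different account than the poller, it may also need read access to the file, in which case a shared group and mode 640 is a reasonable compromise.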
That should get you started. If you have any issues with the script or templates, please let me know in the comments.