I ran into an interesting problem while setting up cacti.
To start with, the Solaris 10 net-snmp in /usr/sfw will not report partition stats (used, max, free) for partitions that are not ufs. I noticed this a little while back with some vxfs filesystems at work, but graphing them was filed as a low-priority project.
There is a workaround blogged at sysadmin.asyd.net, whose author shows how to have snmpd return disk percentages for zfs partitions.
After pondering that workaround, I came to the conclusion that it leaves two issues unsolved:
- You can arbitrarily create filesystems in zfs. To monitor them in cacti, you need to hand-manage your filesystem list in /etc/sma/snmp/snmpd.conf and then manually add or remove them in your cacti configuration. On a dynamic system with a dozen or more filesystems, that is annoying. Across more than one server, it becomes a management nightmare.
- You can only display %used for each filesystem. Because free space is shared across the pool, this figure can grow or shrink for a filesystem whose own contents are static, driven purely by activity on other filesystems (zfs quota assignments minimize the fluctuations, but will not make them go away).
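To make the second point concrete, here is a small sketch of the arithmetic. It assumes %used is computed as used / (used + available), which is how df presents a zfs dataset; the helper name pct_used and the pool sizes are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: percent used for one filesystem, assuming the
# filesystem's apparent size is its own usage plus the pool's free space.
pct_used() {
    # $1 = space used by this filesystem, $2 = free space in the pool (same units)
    awk -v used="$1" -v avail="$2" \
        'BEGIN { printf "%.1f\n", 100 * used / (used + avail) }'
}

# 100G pool: "home" holds 10G, a sibling holds 20G, so 70G is free.
pct_used 10 70    # prints 12.5

# The sibling grows to 60G. "home" has not changed, but only 30G is free.
pct_used 10 30    # prints 25.0
```

The filesystem's own data never moved, yet its %used doubled, which is why a per-filesystem percentage is a shaky thing to graph.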
Given the zfs philosophy of “filesystems come and go”, it doesn’t make sense to try to plot all of them. If you have a home fileserver, you may have quite a few filesystems so that you can compartmentalize your data (as I have). Putting all of them into graphs in cacti will create a very busy page that’ll be mildly painful to scroll through.
The solution? Map the zpools instead — they’re (generally) tied to devices, so they’re less likely to be created and removed on a regular basis.
The concept is quite simple: take the output of something like this
: myserver; zpool list
NAME     SIZE   USED   AVAIL  CAP  HEALTH  ALTROOT
data     1.81T  1.74T  70.4G  96%  ONLINE  -
export   38.8G  13.9G  24.8G  35%  ONLINE  -
and twiddle it so that snmpd will digest and spit it out.
First, a little shell scripting. We want raw numbers so that we can graph them, so we need to get rid of those pesky non-numeric characters. Something along the lines of this:
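#!/bin/ksh
# Print the capacity (%used) of the zpool named in $1 as a bare number.
export PATH=/usr/bin:/usr/sbin:/sbin
export LD_LIBRARY_PATH=/usr/lib
# -H drops the header line; sed strips the trailing % sign.
zpool list -H -o capacity "${1}" | sed -e 's/%//g'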
Save the script as /etc/sma/snmp/zpool-list.ksh, make it executable, and add one exec line per pool to /etc/sma/snmp/snmpd.conf:
exec zpool-list.ksh /etc/sma/snmp/zpool-list.ksh export
exec zpool-list.ksh /etc/sma/snmp/zpool-list.ksh data
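If you have more than a couple of pools, the exec lines could be generated rather than typed. This is only a sketch: gen_exec_lines is a made-up helper name, and it assumes the script lives at the path used in this post. Order matters, because the index snmpd assigns to each entry appears to follow the order of the exec lines in the file.

```shell
#!/bin/sh
# Hypothetical helper: reads pool names on stdin, one per line, and prints
# one snmpd.conf "exec" line per pool. Blank lines are skipped.
gen_exec_lines() {
    script=/etc/sma/snmp/zpool-list.ksh   # path used in this post
    while read -r pool; do
        [ -n "$pool" ] || continue
        printf 'exec zpool-list.ksh %s %s\n' "$script" "$pool"
    done
}

# Sample input. On a live system it would come from: zpool list -H -o name
printf 'export\ndata\n' | gen_exec_lines
```

Review the output before pasting it into snmpd.conf. Once graphs exist, reordering the lines would point them at the wrong pool.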
Restart snmpd:
: myserver; sudo svcadm -v restart sma
Action restart set for svc:/application/management/sma:default.
We can use snmpwalk to verify our output:
: myserver; snmpwalk -v 2c -c public localhost .1.3.6.1.4.1.2021.8
UCD-SNMP-MIB::extIndex.1 = INTEGER: 1
UCD-SNMP-MIB::extIndex.2 = INTEGER: 2
UCD-SNMP-MIB::extNames.1 = STRING: zpool-list.ksh
UCD-SNMP-MIB::extNames.2 = STRING: zpool-list.ksh
UCD-SNMP-MIB::extCommand.1 = STRING: /etc/sma/snmp/zpool-list.ksh export
UCD-SNMP-MIB::extCommand.2 = STRING: /etc/sma/snmp/zpool-list.ksh data
UCD-SNMP-MIB::extResult.1 = INTEGER: 0
UCD-SNMP-MIB::extResult.2 = INTEGER: 0
UCD-SNMP-MIB::extOutput.1 = STRING: 35
UCD-SNMP-MIB::extOutput.2 = STRING: 96
UCD-SNMP-MIB::extErrFix.1 = INTEGER: 0
UCD-SNMP-MIB::extErrFix.2 = INTEGER: 0
UCD-SNMP-MIB::extErrFixCmd.1 = STRING:
UCD-SNMP-MIB::extErrFixCmd.2 = STRING:
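Since both entries share the same extNames value, the pool name has to be read from extCommand. The sketch below pairs each index with its pool and value by parsing snmpwalk output on stdin. pair_pools is a made-up helper name, and it assumes the pool is the last word of the command and that every entry produced a non-empty output.

```shell
#!/bin/sh
# Hypothetical helper: reads snmpwalk output for .1.3.6.1.4.1.2021.8 on
# stdin and prints "index pool percent" for each exec entry.
pair_pools() {
    awk '
        # The index is whatever follows the last dot of the first field.
        /extCommand\./ { n = $1; sub(/.*\./, "", n); pool[n] = $NF }
        /extOutput\./  { n = $1; sub(/.*\./, "", n); pct[n]  = $NF }
        END { for (n in pool) print n, pool[n], pct[n] }
    ' | sort -n
}

# Sample lines copied from the snmpwalk output in this post.
pair_pools <<'EOF'
UCD-SNMP-MIB::extCommand.1 = STRING: /etc/sma/snmp/zpool-list.ksh export
UCD-SNMP-MIB::extCommand.2 = STRING: /etc/sma/snmp/zpool-list.ksh data
UCD-SNMP-MIB::extOutput.1 = STRING: 35
UCD-SNMP-MIB::extOutput.2 = STRING: 96
EOF
# prints:
# 1 export 35
# 2 data 96
```

On a live system the input would be the snmpwalk command shown above piped into pair_pools.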
Notice that each exec entry gives us only one line of output (extOutput) to work with. I therefore decided that %used of the zpool was sufficient, so long as I disabled autoscaling.
The cacti steps were fairly straightforward.
1. Create “Data Source(s)” using the “SNMP – Generic OID Template”
2. Create a “Graph Template”, copying most settings from “Unix – Logged in Users”
3. Create “Graph Object(s)”
4. Associate the Graph Object(s) from step 3 with your “Device”
Normally you can skip (2) and do (3) above using “SNMP – Generic OID Template”, but I ran into cacti bug 0001145 and had to create my own template. No sweat, really.
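The Generic OID data source wants a numeric OID rather than the symbolic name. A small sketch for building it, assuming extOutput is column 101 of the UCD-SNMP-MIB extTable (worth confirming on your own host with snmptranslate -On UCD-SNMP-MIB::extOutput.1); ext_output_oid is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: numeric OID of extOutput for exec entry number $1.
# Assumes extTable is .1.3.6.1.4.1.2021.8 and extOutput is column 101.
ext_output_oid() {
    printf '.1.3.6.1.4.1.2021.8.1.101.%s\n' "$1"
}

ext_output_oid 1    # prints .1.3.6.1.4.1.2021.8.1.101.1
ext_output_oid 2    # prints .1.3.6.1.4.1.2021.8.1.101.2

# To check a value before entering the OID in cacti:
#   snmpget -v 2c -c public -Oqv localhost "$(ext_output_oid 1)"
```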
You can find the details of the above cacti steps on forums.cacti.net.