Mini HOWTO #2: system monitoring via SNMP
Goal: We want to monitor system resources such as CPU utilization, memory utilization, disk space, processes, and system load via SNMP.
Solution: Install and configure Net-SNMP
1. Install Net-SNMP
- if installing from source, the configuration file snmpd.conf will go into /usr/local/share/snmp
- by default there is no configuration file; it can be generated via the snmpconf Perl utility
2a. Keep access control simple; the following entries can be defined in snmpd.conf (as opposed to the more complicated com2sec, group, etc.):
# rwuser: an SNMPv3 read-write user
# arguments: user [noauth|auth|priv] [restriction_oid]
rwuser topsecretv3
# rouser: an SNMPv3 read-only user
# arguments: user [noauth|auth|priv] [restriction_oid]
rouser topsecretv3_ro
# rocommunity: an SNMPv1/SNMPv2c read-only access community name
# arguments: community [default|hostname|network/bits] [oid]
rocommunity topsecret_ro
# rwcommunity: an SNMPv1/SNMPv2c read-write access community name
# arguments: community [default|hostname|network/bits] [oid]
rwcommunity topsecret
2b. Disk space can be monitored by adding entries to the 'disk' section. Example:
disk /
disk /boot
disk /usr
2c. Processes can be monitored by adding entries to the 'proc' section. Example:
proc java
proc postmaster
proc mysqld
2d. System load can be monitored by adding entries to the 'load' section; the three values are the thresholds for the 1-, 5- and 15-minute load averages. Example:
load 5 5 5
2e. The EXAMPLE.conf file in the source directory shows more capabilities of the SNMP agent (you can run executables/scripts and return one line of output and an exit code)
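Putting steps 2a through 2d together, a minimal snmpd.conf could be generated along these lines. This is only a sketch: the helper name write_snmpd_conf and the idea of writing the file from a script are illustrative assumptions, not part of Net-SNMP; the directives are the ones listed above.

```shell
# write_snmpd_conf is a hypothetical helper (not part of Net-SNMP).
# It writes the directives from steps 2a-2d to the file given as $1.
write_snmpd_conf() {
    conf_file="$1"
    cat > "$conf_file" <<'EOF'
# 2a. access control: read-only SNMPv1/v2c community
rocommunity topsecret_ro
# 2b. disks to monitor (first entry is index 1, second is index 2, ...)
disk /
disk /boot
disk /usr
# 2c. processes to monitor
proc java
proc postmaster
proc mysqld
# 2d. thresholds for the 1-, 5- and 15-minute load averages
load 5 5 5
EOF
}

# Example (path from step 1, source install):
#   write_snmpd_conf /usr/local/share/snmp/snmpd.conf
```

The order of the 'disk' and 'proc' lines matters later on, because the queries in step 4 address entries by their position.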
3. Start up the SNMP daemon (agent) by running /usr/local/sbin/snmpd. If you want snmpd to start up automatically at boot time, add the line '/usr/local/sbin/snmpd' to /etc/rc.d/rc.local on Red Hat systems, or equivalent on other flavors of Unix
3a. The agent logs to /var/log/snmpd.log (for more detailed debugging info, start the agent with the -D flag)
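If the boot-time line is added from a script, it helps to make the change idempotent so that repeated runs do not start several copies of the daemon. A minimal sketch, assuming a hypothetical helper named enable_at_boot and the rc.local path from step 3:

```shell
# enable_at_boot is a hypothetical helper: it appends the snmpd start
# line to an rc.local-style file only if that exact line is not there yet.
enable_at_boot() {
    rc_file="$1"
    start_line='/usr/local/sbin/snmpd'
    # -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF "$start_line" "$rc_file" 2>/dev/null; then
        echo "$start_line" >> "$rc_file"
    fi
}

# Example (Red Hat path from step 3; run as root):
#   enable_at_boot /etc/rc.d/rc.local
```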
4. On the SNMP monitoring host, use snmpget to query the SNMP agent running on the target host. The trick here is to know which OIDs to use when you query the agent.
Examples:
Get available disk space for / on the target host:
snmpget -v 1 -c "community" target_name_or_ip .1.3.6.1.4.1.2021.9.1.7.1
(this returns the available disk space, in kB, for the first entry in the 'disk' section of snmpd.conf; replace the final 1 with n for the nth entry)
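To turn that query into a check, the value has to be pulled out of the snmpget output and compared with a limit. The sketch below assumes the output line has the same "name = TYPE: value" shape as the snmpwalk samples in this HOWTO; the helper names snmp_value and disk_low, and the 1048576 threshold, are made up for illustration.

```shell
# snmp_value is a hypothetical helper: it prints whatever follows the
# last ": " in one line of snmpget output, e.g.
#   "UCD-SNMP-MIB::dskAvail.1 = INTEGER: 123456"  ->  "123456"
snmp_value() {
    printf '%s\n' "$1" | sed 's/^.*: //'
}

# disk_low succeeds (exit status 0) when the available space reported
# in the snmpget line $1 is below the threshold $2 (same units, kB).
disk_low() {
    avail=$(snmp_value "$1")
    [ "$avail" -lt "$2" ]
}

# Example against a live agent (community and host are placeholders):
#   line=$(snmpget -v 1 -c "community" target_name_or_ip .1.3.6.1.4.1.2021.9.1.7.1)
#   disk_low "$line" 1048576 && echo "less than 1 GB free on /"
```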
Get the number of java processes running on the target host:
snmpget -v 1 -c "community" target_name_or_ip .1.3.6.1.4.1.2021.2.1.5.1
(replace 1 at the end with n for the nth entry in the 'proc' section)
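Because the last number of the OID is just the position of the entry in the 'proc' section, the query is easy to parameterize. A rough sketch, assuming hypothetical helpers proc_count and proc_running; the -Oqv option asks the Net-SNMP tools to print only the value:

```shell
# proc_count HOST N: print the number of running processes for the
# Nth 'proc' entry in snmpd.conf on HOST (hypothetical helper).
proc_count() {
    snmpget -v 1 -c "community" -Oqv "$1" ".1.3.6.1.4.1.2021.2.1.5.$2"
}

# proc_running HOST N: exit status 0 if at least one process is running,
# 1 if none is running, 2 if the query itself failed.
proc_running() {
    count=$(proc_count "$1" "$2") || return 2
    [ "$count" -gt 0 ]
}

# Example ("community" and the host name are placeholders):
#   proc_running target_name_or_ip 1 || echo "java is not running"
```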
Get the 1-minute system load on the target host:
snmpget -v 1 -c "community" target_name_or_ip .1.3.6.1.4.1.2021.10.1.3.1
Get the 5-minute system load on the target host:
snmpget -v 1 -c "community" target_name_or_ip .1.3.6.1.4.1.2021.10.1.3.2
Get the 15-minute system load on the target host:
snmpget -v 1 -c "community" target_name_or_ip .1.3.6.1.4.1.2021.10.1.3.3
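The three load queries differ only in the last number, so they can be looped over and compared with a limit. This is a sketch under a few assumptions: the helper names load_exceeds and check_loads are invented here, awk is used because the load values are not integers, and any double quotes around the returned string are stripped just in case.

```shell
# load_exceeds VALUE LIMIT: succeed when VALUE > LIMIT (decimals allowed).
load_exceeds() {
    awk -v v="$1" -v max="$2" 'BEGIN { exit !(v + 0 > max + 0) }'
}

# check_loads HOST LIMIT: query the 1-, 5- and 15-minute load averages
# (indexes 1, 2, 3) and report each one that is above LIMIT.
check_loads() {
    host="$1"
    limit="$2"
    for i in 1 2 3; do
        # -Oqv prints only the value; an empty result counts as 0
        v=$(snmpget -v 1 -c "community" -Oqv "$host" \
                ".1.3.6.1.4.1.2021.10.1.3.$i" | tr -d '"')
        if load_exceeds "$v" "$limit"; then
            echo "load average #$i is $v (limit $limit)"
        fi
    done
}

# Example ("community" and the host name are placeholders):
#   check_loads target_name_or_ip 5
```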
Get various CPU utilization metrics on the target host via snmpwalk:
snmpwalk -v 1 -c "community" target_name_or_ip .1.3.6.1.4.1.2021.11
Sample output:
UCD-SNMP-MIB::ssIndex.0 = INTEGER: 1
UCD-SNMP-MIB::ssErrorName.0 = STRING: systemStats
UCD-SNMP-MIB::ssSwapIn.0 = INTEGER: 0
UCD-SNMP-MIB::ssSwapOut.0 = INTEGER: 0
UCD-SNMP-MIB::ssIOSent.0 = INTEGER: 1
UCD-SNMP-MIB::ssIOReceive.0 = INTEGER: 5
UCD-SNMP-MIB::ssSysInterrupts.0 = INTEGER: 5
UCD-SNMP-MIB::ssSysContext.0 = INTEGER: 8
UCD-SNMP-MIB::ssCpuUser.0 = INTEGER: 0
UCD-SNMP-MIB::ssCpuSystem.0 = INTEGER: 0
UCD-SNMP-MIB::ssCpuIdle.0 = INTEGER: 99
UCD-SNMP-MIB::ssCpuRawUser.0 = Counter32: 1007102
UCD-SNMP-MIB::ssCpuRawNice.0 = Counter32: 3879
UCD-SNMP-MIB::ssCpuRawSystem.0 = Counter32: 544737
UCD-SNMP-MIB::ssCpuRawIdle.0 = Counter32: 238396576
To retrieve a specific metric, for example the number of interrupts, you would do:
snmpget -v 1 -c "community" target_name_or_ip .1.3.6.1.4.1.2021.11.7.0
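(we append 7.0 to the OID that we used in snmpwalk: ssSysInterrupts is object 7 in that subtree, and .0 is the instance suffix of a scalar value. Here the number happens to match the position in the snmpwalk output, but that does not hold in general; ssCpuRawUser, for example, is 12th in the output but is object 50. If the MIB files are installed, snmptranslate -On UCD-SNMP-MIB::ssSysInterrupts prints the numeric OID.)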
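For a quick CPU figure, the idle percentage can be picked out of the snmpwalk output by name instead of by OID. A small sketch, assuming the output uses symbolic names as in the sample above (that is, the UCD-SNMP-MIB file is available on the monitoring host); cpu_busy is a hypothetical helper name.

```shell
# cpu_busy reads snmpwalk output for .1.3.6.1.4.1.2021.11 on stdin and
# prints 100 minus ssCpuIdle, i.e. the percentage of non-idle CPU time.
cpu_busy() {
    awk '$1 ~ /ssCpuIdle\.0$/ { print 100 - $NF }'
}

# Example ("community" and the host name are placeholders):
#   snmpwalk -v 1 -c "community" target_name_or_ip .1.3.6.1.4.1.2021.11 | cpu_busy
```

Fed the sample output above, where ssCpuIdle is 99, this prints 1.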
Get various memory utilization metrics on the target host via snmpwalk:
snmpwalk -v 1 -c "community" target_name_or_ip .1.3.6.1.4.1.2021.4
Sample output:
UCD-SNMP-MIB::memIndex.0 = INTEGER: 0
UCD-SNMP-MIB::memErrorName.0 = STRING: swap
UCD-SNMP-MIB::memTotalSwap.0 = INTEGER: 2048276
UCD-SNMP-MIB::memAvailSwap.0 = INTEGER: 2005604
UCD-SNMP-MIB::memTotalReal.0 = INTEGER: 998560
UCD-SNMP-MIB::memAvailReal.0 = INTEGER: 89896
UCD-SNMP-MIB::memTotalFree.0 = INTEGER: 2095500
UCD-SNMP-MIB::memMinimumSwap.0 = INTEGER: 16000
UCD-SNMP-MIB::memShared.0 = INTEGER: 0
UCD-SNMP-MIB::memBuffer.0 = INTEGER: 234884
UCD-SNMP-MIB::memCached.0 = INTEGER: 459016
UCD-SNMP-MIB::memSwapError.0 = INTEGER: 0
UCD-SNMP-MIB::memSwapErrorMsg.0 = STRING:
To retrieve a specific metric, for example the amount of available swap space, you would do:
snmpget -v 1 -c "community" target_name_or_ip .1.3.6.1.4.1.2021.4.4.0
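(we append 4.0 to the OID that we used in snmpwalk: memAvailSwap is object 4 in that subtree, and .0 is the instance suffix of a scalar value. Here the number happens to match the position in the snmpwalk output, but that does not hold further down the list; memTotalFree, for example, is 7th in the output but is object 11.)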
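The raw memory numbers are easier to read as percentages. The sketch below derives two figures from the snmpwalk output: real memory in use, and real memory in use once buffers and cache are left out. The helper name mem_used_pct and the choice of these two figures are assumptions made for illustration, and symbolic names in the output are assumed as in the sample.

```shell
# mem_used_pct reads snmpwalk output for .1.3.6.1.4.1.2021.4 on stdin
# and prints two integers: percent of real memory in use, and the same
# figure with buffers and cache not counted as used.
mem_used_pct() {
    awk '
        $1 ~ /memTotalReal\.0$/ { total  = $NF }
        $1 ~ /memAvailReal\.0$/ { avail  = $NF }
        $1 ~ /memBuffer\.0$/    { buffer = $NF }
        $1 ~ /memCached\.0$/    { cached = $NF }
        END {
            if (total > 0) {
                used = total - avail
                printf "%d %d\n", int(used * 100 / total),
                       int((used - buffer - cached) * 100 / total)
            }
        }'
}

# Example ("community" and the host name are placeholders):
#   snmpwalk -v 1 -c "community" target_name_or_ip .1.3.6.1.4.1.2021.4 | mem_used_pct
```

With the sample output above this prints "90 21": about 90% of real memory is allocated, but only about 21% once buffers and cache are excluded.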
Note: for CPU and memory stats, you don't need to add any special directives in the snmpd.conf configuration file




4 Comments:
Great article, it's very hard to find this information anywhere. Are there any howto's on getting this same information sent out as SNMP traps (e.g. when CPU% or disk space gets above a certain level)?
By
Anonymous, at 6:36 AM
I would like to know how you can monitor the following process ...
perl -w /opt/aws/platform/admindaemon/script/awsconfclientd id
As we know that the Operating System recognises it, just as perl without arguments...
Any idea to solve this issue ??
My email is iuzcat@cantv.com.ve
Thanks and Regards.
By
simulate-unix, at 1:51 PM
Would it work if you called the perl command line inside a bash script and monitored the name of that script?
If not, I'd use a different system for monitoring processes -- you could ssh into the remote system via ssh with public keys, then do a ps and grep for the exact process name.
Grig
By
Grig Gheorghiu, at 2:10 PM
Great work, but i can't find /usr/local/share/snmp/snmp.conf, I'm using Debian, i try to find it with locate, but no luck.
SNMP is working, 'cause i use snmpwalk from another machine and i got a answer.
And where did you find, what MIB query to use, is system dependant?
PS: Sorry for my bad english, it is not my mother tongue.
By
Javier, at 7:23 AM