Gentoo - Time Series Monitoring
1. Introduction
There are lots of tools that help with time series monitoring. One thing to understand up front is that this howto does not cover alerting, that is, being notified when a value exceeds a threshold. Look toward Nagios for that kind of thing.
We will use the following tools to accomplish this task:
| Name | Purpose |
|---|---|
| Net-SNMP | Exposes raw data such as memory, CPU, and network usage on the server, using the standard SNMP protocol. |
| RRDtool | Both a database format and a set of tools for manipulating the database and producing very pretty graphs. |
| MRTG | Used here as an SNMP client that queries the servers and writes to an RRDtool-compatible database. |
Much of this howto was derived from the drwindows tutorial on the Gentoo Forums.
2. Install Net-SNMP
root# emerge -a net-snmp
This installs both an SNMP server and an SNMP client.
3. Configure Net-SNMP server
Make /etc/snmp/snmpd.conf look like this:
com2sec local 127.0.0.1/32 public
com2sec local 10.0.2.0/24 public
group MyROGroup v1 local
group MyROGroup v2c local
group MyROGroup usm local
view all included .1 80
access MyROGroup "" any noauth exact all none none
syslocation Cincinnati
syscontact John McFarlane
A few things remain to finish this step:
- Start the Net-SNMP daemon:
  root# /etc/init.d/snmpd start
- Set Net-SNMP to start at boot:
  root# rc-update add snmpd default
- Test to see what data Net-SNMP exposes:
  root# snmpwalk -v 1 -c public HOSTNAME
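Before moving on, it can help to confirm that the specific objects MRTG will poll later actually answer. The sketch below assumes the `public` community from the configuration above and that `snmpget` (installed with net-snmp) is on the PATH. The function name `check_oids` and the `SNMPGET` override variable are made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: query a handful of objects that the MRTG targets rely on.
# SNMPGET can be overridden (for example SNMPGET=echo) to preview the commands.
SNMPGET=${SNMPGET:-snmpget}

check_oids() {
    local host=$1 community=${2:-public} oid
    for oid in \
        UCD-SNMP-MIB::ssCpuRawUser.0 \
        UCD-SNMP-MIB::memAvailReal.0 \
        TCP-MIB::tcpCurrEstab.0 \
        HOST-RESOURCES-MIB::hrSystemProcesses.0
    do
        # Report objects that do not answer instead of stopping at the first failure.
        $SNMPGET -v 1 -c "$community" "$host" "$oid" || echo "no answer for $oid" >&2
    done
}
```

After sourcing the file, something like `check_oids HOSTNAME` should print one value per object. A missing value usually points at the `view` or `access` lines in the snmpd configuration.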
4. Install MRTG
root# emerge -a mrtg
root# mkdir -p /etc/mrtg/devices
root# mkdir -p /etc/mrtg/graphs
5. Configure MRTG
- Make the main MRTG configuration /etc/mrtg/mrtg.cfg look like this:
  WorkDir: /var/lib/mrtg
  Logdir: /var/log
  LogFormat: rrdtool
  RunAsDaemon: No
  LoadMIBs: /usr/share/snmp/mibs/UCD-SNMP-MIB.txt, /usr/share/snmp/mibs/TCP-MIB.txt, /usr/share/snmp/mibs/HOST-RESOURCES-MIB.txt
- Here is a tip on how to gather details you will need in the next step, such as the interface names and index numbers:
  root# snmptable -v 1 -c public `hostname` ifTable | cut -b-33
- Create the first device configuration /etc/mrtg/devices/HOSTNAME.inc:
  Target[HOSTNAME.usrsys]: ssCpuRawUser.0&ssCpuRawSystem.0:public@HOSTNAME
  MaxBytes[HOSTNAME.usrsys]: 100
  Title[HOSTNAME.usrsys]: CPU usr sys

  Target[HOSTNAME.idlenice]: ssCpuRawIdle.0&ssCpuRawNice.0:public@HOSTNAME
  MaxBytes[HOSTNAME.idlenice]: 100
  Title[HOSTNAME.idlenice]: CPU idle nice

  Target[HOSTNAME.tcpopen]: tcpCurrEstab.0&tcpCurrEstab.0:public@HOSTNAME
  MaxBytes[HOSTNAME.tcpopen]: 1000000
  Title[HOSTNAME.tcpopen]: Open TCP connections
  Options[HOSTNAME.tcpopen]: gauge

  Target[HOSTNAME.proc]: hrSystemProcesses.0&hrSystemProcesses.0:public@HOSTNAME
  MaxBytes[HOSTNAME.proc]: 1000
  Title[HOSTNAME.proc]: Number of running processes
  Options[HOSTNAME.proc]: gauge

  Target[HOSTNAME.freemem]: memTotalFree.0&memTotalFree.0:public@HOSTNAME
  MaxBytes[HOSTNAME.freemem]: 1000000
  Title[HOSTNAME.freemem]: Free Memory Total
  Options[HOSTNAME.freemem]: gauge

  Target[HOSTNAME.ram.swap]: memAvailReal.0&memAvailSwap.0:public@HOSTNAME
  MaxBytes[HOSTNAME.ram.swap]: 456488
  Title[HOSTNAME.ram.swap]: RAM vs swap Free Memory
  Options[HOSTNAME.ram.swap]: gauge

  Target[HOSTNAME.diskspercent]: .1.3.6.1.4.1.2021.9.1.9.1&.1.3.6.1.4.1.2021.9.1.9.2:public@HOSTNAME
  MaxBytes[HOSTNAME.diskspercent]: 100
  Title[HOSTNAME.diskspercent]: disk usage percent
  Options[HOSTNAME.diskspercent]: gauge,nopercent

  Target[HOSTNAME.disks.usage]: .1.3.6.1.4.1.2021.9.1.7.1&.1.3.6.1.4.1.2021.9.1.7.2:public@HOSTNAME
  MaxBytes1[HOSTNAME.disks.usage]: 24691824
  MaxBytes2[HOSTNAME.disks.usage]: 14048404
  Title[HOSTNAME.disks.usage]: disk available totals
  Options[HOSTNAME.disks.usage]: gauge,nopercent

  Target[elk.traffic]: \eth0:public@elk:
  MaxBytes[elk.traffic]: 12500000
  Title[elk.traffic]: 10.0.1.5 -- elk
  Rinse and repeat for other devices you want to monitor. Here's an example of what you might use to monitor a router:
  Target[HOSTNAME.traffic]: 2:public@IP-ADDRESS
  MaxBytes1[HOSTNAME.traffic]: 250000
  MaxBytes2[HOSTNAME.traffic]: 125000
  Title[HOSTNAME.traffic]: Cable Modem Traffic Analysis
- Include your device config(s) in the main MRTG configuration by appending to the end of /etc/mrtg/mrtg.cfg:
  Include: devices/HOSTNAME.inc
  #Include: devices/foobar.inc
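Since each additional host needs the same stanzas under a different name, the "rinse and repeat" step can be scripted. This is only a sketch: `make_device_inc` is a hypothetical helper, it emits just the CPU and process-count targets from the example above, and it assumes the host answers to the community string passed in (`public` by default).

```shell
#!/bin/bash
# Hypothetical helper: print a minimal MRTG device config for one host.
make_device_inc() {
    local host=$1 community=${2:-public}
    cat <<EOF
Target[${host}.usrsys]: ssCpuRawUser.0&ssCpuRawSystem.0:${community}@${host}
MaxBytes[${host}.usrsys]: 100
Title[${host}.usrsys]: CPU usr sys

Target[${host}.proc]: hrSystemProcesses.0&hrSystemProcesses.0:${community}@${host}
MaxBytes[${host}.proc]: 1000
Title[${host}.proc]: Number of running processes
Options[${host}.proc]: gauge
EOF
}
```

With the function sourced, `make_device_inc web1 > /etc/mrtg/devices/web1.inc` would write a starting point for a host called `web1` (an invented name). The matching `Include:` line still has to be added to mrtg.cfg by hand.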
6. Install RRDtool
Enable the perl USE flag for RRDtool first:
root# echo "net-analyzer/rrdtool perl" >> /etc/portage/package.use
Install RRDtool per the usual:
root# emerge -a rrdtool
7. Initialize the databases using RRDtool
root# rrdtool create /var/lib/mrtg/`hostname`.disks.usage.rrd \
--start `date +"%s"` \
DS:ds0:GAUGE:600:0:24691824 \
DS:ds1:GAUGE:600:0:14048404 \
--step 300 \
RRA:AVERAGE:0.5:1:800 \
RRA:AVERAGE:0.5:6:800 \
RRA:AVERAGE:0.5:24:800 \
RRA:AVERAGE:0.5:288:800 \
RRA:MIN:0.5:1:800 \
RRA:MIN:0.5:6:800 \
RRA:MIN:0.5:24:800 \
RRA:MIN:0.5:288:800 \
RRA:MAX:0.5:1:800 \
RRA:MAX:0.5:6:800 \
RRA:MAX:0.5:24:800 \
RRA:MAX:0.5:288:800
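The RRA arguments decide how much history each archive keeps: every row covers step × steps-per-row seconds, and each archive here holds 800 rows. A small sketch of that arithmetic follows; the function name `rra_retention` is invented for this example.

```shell
#!/bin/bash
# Seconds of history held by one RRA: step * steps_per_row * rows.
rra_retention() {
    local step=$1 steps_per_row=$2 rows=$3
    echo $(( step * steps_per_row * rows ))
}

# The four consolidation levels used in this howto, with a 300-second step.
for steps in 1 6 24 288; do
    secs=$(rra_retention 300 "$steps" 800)
    echo "RRA with $steps step(s) per row keeps $(( secs / 86400 )) full days ($secs seconds)"
done
```

With a 300-second step this works out to roughly 2.8 days of 5-minute samples, 16.7 days of 30-minute values, 66.7 days of 2-hour values, and 800 days of daily values.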
root# rrdtool create /var/lib/mrtg/`hostname`.ram.swap.rrd \
--start `date +"%s"` \
DS:ds0:GAUGE:600:0:1034728 \
DS:ds1:GAUGE:600:0:506036 \
--step 300 \
RRA:AVERAGE:0.5:1:800 \
RRA:AVERAGE:0.5:6:800 \
RRA:AVERAGE:0.5:24:800 \
RRA:AVERAGE:0.5:288:800 \
RRA:MIN:0.5:1:800 \
RRA:MIN:0.5:6:800 \
RRA:MIN:0.5:24:800 \
RRA:MIN:0.5:288:800 \
RRA:MAX:0.5:1:800 \
RRA:MAX:0.5:6:800 \
RRA:MAX:0.5:24:800 \
RRA:MAX:0.5:288:800
root# rrdtool create /var/lib/mrtg/`hostname`.freemem.rrd \
--start `date +"%s"` \
DS:ds0:GAUGE:600:0:1034728 \
DS:ds1:GAUGE:600:0:506036 \
--step 300 \
RRA:AVERAGE:0.5:1:800 \
RRA:AVERAGE:0.5:6:800 \
RRA:AVERAGE:0.5:24:800 \
RRA:AVERAGE:0.5:288:800 \
RRA:MIN:0.5:1:800 \
RRA:MIN:0.5:6:800 \
RRA:MIN:0.5:24:800 \
RRA:MIN:0.5:288:800 \
RRA:MAX:0.5:1:800 \
RRA:MAX:0.5:6:800 \
RRA:MAX:0.5:24:800 \
RRA:MAX:0.5:288:800
root# rrdtool create /var/lib/mrtg/`hostname`.traffic.rrd \
--start `date +"%s"` \
DS:ds0:COUNTER:600:0:50000 \
DS:ds1:COUNTER:600:0:25000 \
--step 300 \
RRA:AVERAGE:0.5:1:800 \
RRA:AVERAGE:0.5:6:800 \
RRA:AVERAGE:0.5:24:800 \
RRA:AVERAGE:0.5:288:800 \
RRA:MIN:0.5:1:800 \
RRA:MIN:0.5:6:800 \
RRA:MIN:0.5:24:800 \
RRA:MIN:0.5:288:800 \
RRA:MAX:0.5:1:800 \
RRA:MAX:0.5:6:800 \
RRA:MAX:0.5:24:800 \
RRA:MAX:0.5:288:800
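The four commands above differ only in the file name, the data source type, and the two maximum values, so they could be wrapped in a function. The sketch below is one way to do that, not part of the original procedure. `create_mrtg_rrd` and the `RRDTOOL` override variable are hypothetical names, and the RRA layout is copied from the commands above.

```shell
#!/bin/bash
# RRDTOOL can be set to "echo" to print the command instead of running it.
RRDTOOL=${RRDTOOL:-rrdtool}

# Usage: create_mrtg_rrd FILE DS_TYPE MAX_DS0 MAX_DS1
create_mrtg_rrd() {
    local file=$1 type=$2 max0=$3 max1=$4
    local rras=() cf steps
    # Build the same 12 archives as above: AVERAGE, MIN, MAX at 1/6/24/288 steps.
    for cf in AVERAGE MIN MAX; do
        for steps in 1 6 24 288; do
            rras+=("RRA:$cf:0.5:$steps:800")
        done
    done
    $RRDTOOL create "$file" \
        --start "$(date +%s)" \
        --step 300 \
        "DS:ds0:$type:600:0:$max0" \
        "DS:ds1:$type:600:0:$max1" \
        "${rras[@]}"
}
```

For example, `create_mrtg_rrd /var/lib/mrtg/$(hostname).traffic.rrd COUNTER 50000 25000` would correspond to the last command above.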
8. Populate the RRDtool databases
root# mrtg /etc/mrtg/mrtg.cfg
9. Create bash scripts to generate graphs
- CPU /etc/mrtg/graphs/HOSTNAME-cpu.bash:
  #!/bin/bash
  #Generates a CPU info graph
  ###########################
  HOST=elk

  # Map the requested period to a number of seconds; default to one day
  case $1 in
      day) INTERVAL=86400;;
      week) INTERVAL=604800;;
      month) INTERVAL=2678400;;
      year) INTERVAL=31622400;;
      *) INTERVAL=86400;;
  esac

  if [ "$INTERVAL" -eq 86400 ]; then
      INTERVALSTR="day"
  else
      INTERVALSTR="$1"
  fi

  echo Generating cpu.percent.${HOST}.$INTERVALSTR.png
  echo Using $INTERVAL interval

  # Adjust the output directory to suit your system
  rrdtool graph /rf/blobs/rockfloat/rrdtool/cpu.percent.${HOST}.$INTERVALSTR.png \
      -s -$INTERVAL \
      -a PNG \
      -z \
      DEF:user=/var/lib/mrtg/${HOST}.usrsys.rrd:ds0:AVERAGE \
      DEF:system=/var/lib/mrtg/${HOST}.usrsys.rrd:ds1:AVERAGE \
      DEF:idle=/var/lib/mrtg/${HOST}.idlenice.rrd:ds0:AVERAGE \
      DEF:nice=/var/lib/mrtg/${HOST}.idlenice.rrd:ds1:AVERAGE \
      "CDEF:total=100,idle,-" \
      COMMENT:" Max Avg Current\n" \
      AREA:system#FF4000:"System " \
      GPRINT:system:MAX:'%7.2lf %%' \
      GPRINT:system:AVERAGE:"%7.2lf %%" \
      GPRINT:system:LAST:"%7.2lf %%\n" \
      STACK:user#0080FF:"User " \
      GPRINT:user:MAX:'%7.2lf %%' \
      GPRINT:user:AVERAGE:"%7.2lf %%" \
      GPRINT:user:LAST:"%7.2lf %%\n" \
      STACK:nice#00FFFF:"Nice " \
      GPRINT:nice:MAX:'%7.2lf %%' \
      GPRINT:nice:AVERAGE:"%7.2lf %%" \
      GPRINT:nice:LAST:"%7.2lf %%\n" \
      LINE1:total#008080:"CPU " \
      GPRINT:total:MAX:'%7.2lf %%' \
      GPRINT:total:AVERAGE:"%7.2lf %%" \
      GPRINT:total:LAST:"%7.2lf %%\n" \
      -v "%" -t "CPU usage - $INTERVALSTR" -l 0
  root# chmod 755 /etc/mrtg/graphs/`hostname`-cpu.bash
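The INTERVAL values in the case statement are simply day counts multiplied by 86400 seconds: 1, 7, 31, and 366 days. A small sketch that derives them instead of hard-coding them could look like this (`interval_seconds` is a name made up for this example):

```shell
#!/bin/bash
# Translate a period name into seconds, mirroring the case statement above.
interval_seconds() {
    local days
    case $1 in
        week)  days=7 ;;
        month) days=31 ;;
        year)  days=366 ;;
        *)     days=1 ;;   # "day" and anything unrecognized
    esac
    echo $(( days * 86400 ))
}
```

In a graph script this could be used as `INTERVAL=$(interval_seconds "$1")`, which keeps the same defaults as the original.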
10. Generate all the graphs
Create /etc/mrtg/gen-graphs.bash:
#!/bin/bash
echo "The graphs are likely running now too.. give them a sec to finish"
sleep 30
cd /etc/mrtg/graphs
for file in *.bash; do
    # Double quotes so that ${file} is expanded
    echo "Running ${file}..."
    "./$file"
    echo ''
done
Actually generate the graphs:
root# chmod 755 /etc/mrtg/gen-graphs.bash
root# /etc/mrtg/gen-graphs.bash
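As written, the loop calls each graph script without an argument, so only the default one-day graphs are produced. A possible variant is sketched below. It assumes every script in the directory accepts day, week, month, or year as its first argument the way the CPU script does, and `run_all_graphs` is a hypothetical name.

```shell
#!/bin/bash
# Run every executable *.bash script in a directory once per period.
run_all_graphs() {
    local dir=${1:-/etc/mrtg/graphs} file period
    for file in "$dir"/*.bash; do
        [ -x "$file" ] || continue      # skip non-executables and an unmatched glob
        for period in day week month year; do
            echo "Running $file $period..."
            "$file" "$period"
        done
    done
}
```

Calling `run_all_graphs` with no argument would process /etc/mrtg/graphs. Another directory can be passed as the first argument.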
This document was originally created on 10/5/2005
Disclaimer:
This page is not endorsed by gentoo.org or any other cool
cats. Any information provided in this document is to be used
at your own risk.