Monitoring CPU, RAM, Network with collectd
Linux 06.05.2014

collectd is a UNIX-daemon that collects, transfers and stores performance data of computers and network equipment. The acquired data is meant to help system administrators maintain an overview over available resources to detect existing or looming bottlenecks (source).

collectd uses a modular design: the daemon itself only implements infrastructure for filtering and relaying data as well as auxiliary functions and requires very few resources.

Data acquisition and storage are handled by plugins in the form of shared objects. Data acquisition plugins, called "read plugins" in collectd's documentation, gather the values. So-called "write plugins" store the collected data on disk as RRD or CSV files, or send it over the network to a remote instance of the daemon.
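To make the read/write split concrete, here is a minimal sketch that writes a small configuration fragment to a scratch file for review. It pairs one read plugin (cpu) with two write plugins (csv and network). The CSV directory and the remote address 192.0.2.10 are placeholders, not values from this setup, and the fragment is meant to be merged into the real configuration by hand.

```shell
#!/bin/sh
# Sketch only: the fragment goes to a temporary file, not to /etc.
# The DataDir path and the server address below are placeholders.
fragment="$(mktemp)"
cat > "$fragment" <<'EOF'
# read plugin: samples CPU counters
LoadPlugin cpu

# write plugin: stores values as CSV files on local disk
LoadPlugin csv
<Plugin csv>
        DataDir "/var/lib/collectd/csv"
</Plugin>

# write plugin: sends values to a remote collectd instance
LoadPlugin network
<Plugin network>
        Server "192.0.2.10" "25826"
</Plugin>
EOF
echo "fragment written to $fragment"
```

Every plugin is enabled the same way: a LoadPlugin line, plus an optional <Plugin> block for its settings.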

Install

We'll use collectd for data gathering and rrdtool for visualization.

There are packages in package repositories for Arch Linux and Ubuntu.

# arch
yaourt -S collectd rrdtool

# ubuntu
sudo apt-get install collectd rrdtool
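After installing, it can be worth confirming that both binaries are actually on the PATH. The helper below is a small sketch for that; the function name missing_tools is made up for this example.

```shell
#!/bin/sh
# Print each command that cannot be found on PATH.
# Returns 0 if all were found, 1 otherwise.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# Usage:
#   missing_tools collectd rrdtool
```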

Configuration

The configuration lives in /etc/collectd.conf on Arch and in /etc/collectd/collectd.conf on Ubuntu.

For each plugin, there is a LoadPlugin line in the configuration. Almost all of those lines are commented out in order to keep the default configuration lean. By default the following plugins are enabled: CPU, Interface, Load, and Memory.

Simple config file

# egrep -v '^$|^#' /etc/collectd/collectd.conf

FQDNLookup true
LoadPlugin syslog
<Plugin syslog>
        LogLevel info
</Plugin>
LoadPlugin cpu
LoadPlugin df
LoadPlugin interface
LoadPlugin load
LoadPlugin memory
LoadPlugin processes
LoadPlugin rrdtool
LoadPlugin swap
LoadPlugin users
<Plugin rrdtool>
        DataDir "/var/lib/collectd/rrd"
</Plugin>
<Plugin df>
        Device "/dev/sda1"
        MountPoint "/"
        FSType "ext4"
</Plugin>
Include "/etc/collectd/filters.conf"
Include "/etc/collectd/thresholds.conf"
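A typo in the configuration can keep the daemon from starting, so it may help to let collectd parse the file first. The sketch below wraps collectd's -t (test configuration) and -C (configuration file) options; the function name and the default path are assumptions, so adjust the path for your distribution.

```shell
#!/bin/sh
# Parse a collectd configuration file without starting the daemon.
# Returns 2 if the file is unreadable, otherwise collectd's own exit status.
check_collectd_config() {
    conf="${1:-/etc/collectd/collectd.conf}"
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf" >&2
        return 2
    fi
    collectd -t -C "$conf"
}

# Usage:
#   sudo sh -c '. ./check.sh; check_collectd_config /etc/collectd/collectd.conf'
```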

Once you're done configuring, (re)start the daemon.

# start in arch 
sudo systemctl start collectd.service

# restart in arch
sudo systemctl restart collectd.service

# enable at boot in arch
sudo systemctl enable collectd.service

# start in ubuntu
sudo service collectd start

# restart in ubuntu
sudo service collectd restart
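Once the daemon is running, the rrdtool plugin should start creating files under its DataDir, normally laid out as host/plugin/type.rrd. A quick way to confirm that data is arriving is to list those files. This is a sketch; the function name is made up, and the default directory is the DataDir from the config above.

```shell
#!/bin/sh
# List all RRD files below a collectd data directory, sorted by path.
list_rrd_files() {
    datadir="${1:-/var/lib/collectd/rrd}"
    find "$datadir" -type f -name '*.rrd' | sort
}

# Usage:
#   list_rrd_files /var/lib/collectd/rrd
```

If the list is empty a minute or so after the restart, the syslog output from the syslog plugin is the first place to look.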

Visualization

There are many ways to present collectd data. You can use a front end (collectd-web, Graphite, Visage), or generate PNG files with rrdtool and embed them in an HTML page served by your web server.

The first option is to use a front end. I'm going to show collectd-web.

# download
git clone https://github.com/httpdss/collectd-web.git

# check dependency
cd collectd-web
./check_deps.sh

# make sure collectd looks under the correct location
# cat /etc/collectd/collection.conf 

datadir: "/var/lib/collectd/rrd/" 
libdir: "/usr/lib/collectd/"

After that, copy collectd-web to your web server's public directory.

sudo cp -r ./* /var/www/example.com/public/

The second option is to use rrdtool to generate a PNG file with the graph. Once generated, the image can be used in any HTML file as the source of an img tag.

An example rrdtool script is available here.
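For orientation, a minimal sketch of such a script follows. It is not the linked script. It graphs the last 24 hours of the load plugin's data, whose RRD file has the data sources shortterm, midterm and longterm. The function name, the output path and the host directory in the usage line are assumptions.

```shell
#!/bin/sh
# Render the last 24 hours of load averages from a collectd load.rrd to a PNG.
# Returns 1 without calling rrdtool if the RRD file does not exist.
graph_load() {
    rrd="$1"   # path to load.rrd
    out="$2"   # PNG file to write
    if [ ! -f "$rrd" ]; then
        echo "missing RRD file: $rrd" >&2
        return 1
    fi
    rrdtool graph "$out" \
        --start -1d --title "Load average" \
        "DEF:short=$rrd:shortterm:AVERAGE" \
        "DEF:mid=$rrd:midterm:AVERAGE" \
        "DEF:long=$rrd:longterm:AVERAGE" \
        "LINE1:short#ff0000:1 min" \
        "LINE1:mid#00aa00:5 min" \
        "LINE1:long#0000ff:15 min"
}

# Usage (host directory name depends on your hostname settings):
#   graph_load "/var/lib/collectd/rrd/$(hostname -f)/load/load.rrd" \
#              /var/www/example.com/public/load.png
```

The resulting file can then be referenced from a page with a plain img tag pointing at load.png.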

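Add it to cron as usual. Entries edited with crontab -e have no user field, so the command follows the five time fields directly.

# sudo crontab -e

*/15 * * * * /path/to/script.sh > /dev/null 2>&1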

You can check your network speed with speedtest-cli.

Notification

collectd can notify you when a value crosses a threshold. For example, it can send an email when CPU load exceeds 80% or RAM utilization exceeds 90%.

First, we need to activate the notify_email plugin and set a recipient.

# sudo vim /etc/collectd/collectd.conf

LoadPlugin "threshold"
LoadPlugin notify_email

<Plugin notify_email>
   SMTPServer "smtp.example.com"
   SMTPPort 25
   SMTPUser "user@example.com"
   SMTPPassword "password"
   From "user@example.com"
   Subject "[collectd] %s on %s!"
   Recipient "recipient@example.com"
</Plugin>

Second, set up a threshold.

# sudo vim /etc/collectd/thresholds.conf

<Threshold>
<Type "cpu">
     Instance "user"
     WarningMax 85
     Hits 1
</Type>
</Threshold>

Finally, restart collectd.

# restart in arch
sudo systemctl restart collectd.service

# restart in ubuntu
sudo service collectd restart

To test the notification, you can generate CPU load with cpuburn.
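If cpuburn is not at hand, a plain shell busy loop can stand in for it. The sketch below starts one loop per CPU core and stops them after a given number of seconds using coreutils' timeout. The function name and the 120 second default are arbitrary choices.

```shell
#!/bin/sh
# Keep every CPU core busy for the given number of seconds (default 120).
burn_cpu() {
    seconds="${1:-120}"
    cores="$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)"
    i=0
    while [ "$i" -lt "$cores" ]; do
        # timeout terminates the busy loop once the time is up
        timeout "$seconds" sh -c 'while :; do :; done' &
        i=$((i + 1))
    done
    wait   # block until all busy loops have been stopped
}

# Usage:
#   burn_cpu 180
```

Pick a duration that spans several collectd intervals so the threshold plugin sees the elevated values.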

How to write custom plugin