Nagios (www.nagios.org) is a system monitoring tool. It alerts the system administrator when something breaks on the monitored machines. Tirex comes with a set of Nagios plugins that help you monitor your Tirex system.
There are two Nagios plugins for Tirex monitoring:
- tirex-health: Script that checks whether the Tirex master has recently updated the status message in shared memory. This is similar to checking whether the master process is running, but it also catches the case where the master process is still present but has locked up, which a simple process list check would miss. This script takes no parameters and reports either an "OK" or a "CRITICAL" status.
- tirex-queue-size: Script that checks the total Tirex queue size. Parameters are:
  -w <n> issue WARNING if the queue is larger than <n>
  -c <n> issue CRITICAL if the queue is larger than <n>
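The two checks described above follow the usual Nagios plugin convention of mapping a measurement to a status and an exit code (0 for OK, 1 for WARNING, 2 for CRITICAL). The following is only a rough sketch of that logic in Python, not the code of the actual Tirex plugins. The function names, the `max_age` value, and the message format are assumptions made for illustration.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def heartbeat_status(last_update, now, max_age=60.0):
    """Return OK if the status message was refreshed recently, else CRITICAL.

    last_update and now are timestamps in seconds. max_age is an assumed
    value for illustration and is not taken from Tirex.
    """
    return OK if (now - last_update) <= max_age else CRITICAL


def queue_status(queue_size, warn, crit):
    """Map a queue size to a status using -w/-c style thresholds."""
    if queue_size > crit:
        return CRITICAL
    if queue_size > warn:
        return WARNING
    return OK


def report(name, status, detail):
    """Build the one-line message a plugin prints before exiting."""
    return f"{name} {LABELS[status]}: {detail}"


# Hypothetical example: 250 queued requests with -w 100 -c 1000.
status = queue_status(250, warn=100, crit=1000)
print(report("TIREX QUEUE", status, "250 requests queued"))
```

A real plugin would finish by exiting with the status value as its exit code, which is how Nagios learns the result of the check.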
What to monitor
In addition to running the Tirex-specific Nagios plugins explained above, you should probably also
- check that tirex-backend-manager is running, using the check_procs plugin.
- check for free disk space on the disks that hold your tiles and your database.
These checks come on top of the usual checks for system health.
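In practice the disk space check is normally done with a standard Nagios plugin. Purely to illustrate what such a check measures, here is a minimal Python sketch using the standard library. The function names and the 20% / 10% thresholds are assumptions chosen for the example, and the path you pass in should be the directory where your tiles or database actually live.

```python
import shutil

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def free_percent(path):
    """Return the percentage of free space on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def disk_status(free_pct, warn_below=20.0, crit_below=10.0):
    """Map free space to a status; the thresholds are illustrative only."""
    if free_pct < crit_below:
        return CRITICAL
    if free_pct < warn_below:
        return WARNING
    return OK
```

For example, `disk_status(free_percent("/"))` evaluates the root filesystem against the assumed thresholds.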
If you are using Debian or Ubuntu, install the tirex-nagios-plugin package.
If you are installing from source, run make install-nagios.
The Nagios plugins will be installed into /usr/lib/nagios/plugins/ and the configuration files into /etc/nagios/nrpe.d/.
The tirex-health plugin does not need any configuration. To change the configuration of the tirex-queue-size plugin, edit /etc/nagios/nrpe.d/tirex-queue-size.cfg.