Nagios

We currently use Nagios as a service monitor. It periodically checks nearly every service in UGCS and makes sure that each one is at least responsive. It currently runs on dionysus.

Website

You can view the current status of Nagios at https://dionysus.ugcs.caltech.edu/nagios3. It requires a valid UGCS login, since Nagios's web interface is notorious for vulnerabilities. If you are a sysadmin and your name is listed in the appropriate places in /etc/nagios3/cgi.conf, you can run commands from the website (such as ignoring problems or rescheduling the next check of a host). Look at where the current sysadmins are listed in that file to see where to add your name (as username@UGCS.CALTECH.EDU).

check_ldap

check_ldap won't work unless you modify its plugin command. Edit /etc/nagios-plugins/config/ldap.conf so that the command passes $HOSTNAME$.ugcs.caltech.edu instead of $HOSTADDRESS$.
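The substitution described above can be sketched in Python. This is a minimal illustration, not the procedure actually used on dionysus. The SAMPLE command definition is an assumption about what the stock file looks like (check the real file before editing), and use_hostname is a hypothetical helper name.

```python
# Assumption: the stock check_ldap definition resembles this sample.
# The plugin path and arguments are illustrative, not copied from dionysus.
SAMPLE = """define command{
    command_name    check_ldap
    command_line    /usr/lib/nagios/plugins/check_ldap -H '$HOSTADDRESS$' -b '$ARG1$'
}
"""

DOMAIN = "ugcs.caltech.edu"


def use_hostname(config_text, domain=DOMAIN):
    """Replace $HOSTADDRESS$ with $HOSTNAME$.<domain> on command_line lines only."""
    fixed = []
    for line in config_text.splitlines(keepends=True):
        # Only touch the directive that actually invokes the plugin.
        if line.lstrip().startswith("command_line"):
            line = line.replace("$HOSTADDRESS$", "$HOSTNAME$." + domain)
        fixed.append(line)
    return "".join(fixed)


if __name__ == "__main__":
    # Print the rewritten definition for review instead of editing in place.
    print(use_hostname(SAMPLE), end="")
```

The helper only returns the rewritten text, so the result can be reviewed before anything under /etc is changed.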


Nagios configurator

A set of scripts automatically generates Nagios service and hostgroups. They live in the configurator directory. The base Python file is NagiosInfo.py, which contains two classes, Hostgroup and Servicegroup, that hold information on the appropriate services.

A hostgroup is a set of similar hosts. Classes (from configurator) can be added to a hostgroup with hgroup.add_class(classname). You can also include or exclude specific machines by adding their names to the lists hgroup.includes or hgroup.excludes.
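To make the interface above concrete, here is a hypothetical stand-in for Hostgroup. It is not the code in NagiosInfo.py. Only add_class, includes and excludes come from the description above. The members and render methods, the class_hosts mapping, the rule that excludes win over includes, and all host and class names are assumptions made for illustration.

```python
class Hostgroup:
    """Hypothetical stand-in for the Hostgroup class in NagiosInfo.py."""

    def __init__(self, name):
        self.name = name
        self.classes = []    # configurator class names added via add_class
        self.includes = []   # machines always added to the group
        self.excludes = []   # machines always left out of the group

    def add_class(self, classname):
        """Add a configurator class whose hosts belong to this group."""
        if classname not in self.classes:
            self.classes.append(classname)

    def members(self, class_hosts):
        """Union of class members and includes, minus excludes (assumed rule).

        class_hosts maps a class name to a list of host names (assumed shape).
        """
        hosts = set(self.includes)
        for classname in self.classes:
            hosts.update(class_hosts.get(classname, []))
        return sorted(hosts - set(self.excludes))

    def render(self, class_hosts):
        """Format a Nagios hostgroup object definition as text."""
        return (
            "define hostgroup {\n"
            f"    hostgroup_name  {self.name}\n"
            f"    alias           {self.name}\n"
            f"    members         {','.join(self.members(class_hosts))}\n"
            "}\n"
        )


if __name__ == "__main__":
    # Made-up class and host names, for illustration only.
    class_hosts = {"webserver": ["alpha", "beta"], "shell": ["beta", "gamma"]}
    hgroup = Hostgroup("http-servers")
    hgroup.add_class("webserver")
    hgroup.includes.append("delta")
    hgroup.excludes.append("beta")
    print(hgroup.render(class_hosts), end="")
```

In this sketch the group resolves to alpha and delta: beta arrives through the webserver class but is removed by the excludes list, and delta is added through includes.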
