Aug 08 2012
 

There may be times when you need to forward alerts from one Nagios monitoring system to a second one, for example to a centralised security group that wants oversight of how you’re doing, or to a first-line support team that needs to be the first point of contact. Fortunately, NSCA (Nagios Service Check Acceptor) can be used as a mechanism for doing just this.

I won’t go into too much detail here about NSCA; you’ll need to read its documentation for that, but I’ll offer a very brief primer.

NSCA

Unlike NRPE, in which an agent gets polled by the Nagios server, NSCA is comprised of an agent which sends alerts to the server in reaction to events. The NSCA server listens for these events and then passes them to the Nagios server. It is therefore referred to as “passive” monitoring because Nagios is not actively checking.

The mechanism is as follows. The NSCA client is a binary executable called send_nsca, which connects to the NSCA daemon (nsca, listening on port 5667). Encryption is optional and the available methods vary in strength; both client and server have a configuration file in which it is set. The send_nsca program sends an alert string to the NSCA daemon, which parses it and writes it to the Nagios external command file (a “named pipe” which Nagios watches constantly). As long as the Nagios server recognises the host and service specified in the alert (that is, they’re defined in its configuration), the alert will be registered in Nagios and displayed in the GUI.
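
To make that last step concrete, here is a minimal sketch of the kind of line that ends up in the pipe. It assumes the daemon hands results to Nagios using the PROCESS_SERVICE_CHECK_RESULT external command; the function name and the example timestamp are made up for illustration and are not part of NSCA.

```shell
#!/bin/sh
# Illustrative only: format the four send_nsca service fields as a Nagios
# external command line. nsca_to_external_command is a hypothetical helper.
# Arguments: epoch timestamp, host name, service description, state code, output.
nsca_to_external_command() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

nsca_to_external_command 1344384000 "client-host" "Dummy Service" 2 "Run for it!"
# prints: [1344384000] PROCESS_SERVICE_CHECK_RESULT;client-host;Dummy Service;2;Run for it!
```

Nagios matches the host name and service description in that line against its own object definitions, which is why the names have to agree on both sides.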

To use this process for alarm forwarding from a slave Nagios server to an upstream master Nagios, the idea is to use the NSCA client (send_nsca) as a notification command, and the upstream Nagios master as the contact.

The rest of this post will elaborate on these steps.

Configuring Alarm Forwarding

First, a high level description of the process of alarm forwarding using NSCA. In this example, the Nagios system that receives the forwarded alarm is called “OPSMON”.

  1. Alarm event occurs in Nagios due to multiple failures
  2. Notification is triggered
  3. “OPSMON” is one of the alarm contacts, and its service_notification_command is a send_nsca invocation.
  4. The NSCA client, using send_nsca, forwards an NSCA string containing the alarm information to the NSCA server, “OPSMON”.
  5. “OPSMON” is configured to receive alerts from the client host (on which the alarm was detected) and displays the alert accordingly.

In order to implement these operations, first NSCA must be installed and configured – the client and server. For brevity, I’ll assume that the NSCA server is already configured.

NSCA client

The NSCA client is installed on the Nagios host from which alerts will first be raised, and which will forward these alerts on to OPSMON.

Install the NSCA client:

  # yum -y install nsca-client  (CentOS/Red Hat)
  # apt-get install nsca-client (Ubuntu)

This will create a configuration file called /etc/nagios/send_nsca.cfg. Ensure that this is readable by the nagios user, but preferably not by anyone else, since it will contain a password, if you’re using password encryption.

  # chown nagios:nagios /etc/nagios/send_nsca.cfg
  # chmod 600 /etc/nagios/send_nsca.cfg

You will need to set these values to match whatever is set in the /etc/nagios/nsca.cfg file on the OPSMON server:

  password=
  encryption_method=

Testing NSCA

This can be tested by sending a dummy alert. Check in the Nagios log file (or syslog) on OPSMON to see that the message arrives.

   printf "%s\t%s\t%s\t%s\n" "client-host" "Dummy Service" "2" "Run for it!" | \
       /usr/sbin/send_nsca -H opsmon-host.example.com -c /etc/nagios/send_nsca.cfg

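The fields in the string passed to send_nsca are, in order, the host name, the service description, the numeric state and the plugin output, separated by tabs. They reappear in the notification commands in the configuration section below.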

Note that if this was a real alert, then the OPSMON Nagios would need its configuration to contain both a host defined with the name “client-host” and a service of “Dummy Service”. The text of these would need to match exactly.
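
The “2” in the test message is the numeric service state (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). If you send test messages often, a small wrapper that takes the state by name may be easier to read. This is only a sketch; state_to_code and nsca_service_line are hypothetical helper names.

```shell
#!/bin/sh
# Map a Nagios service state name to its numeric code.
state_to_code() {
    case "$1" in
        OK)       echo 0 ;;
        WARNING)  echo 1 ;;
        CRITICAL) echo 2 ;;
        *)        echo 3 ;;   # UNKNOWN, and anything unrecognised
    esac
}

# Build one tab-separated service line: host, service, state code, output.
# $1 = host name, $2 = service description, $3 = state name, $4 = output text
nsca_service_line() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$(state_to_code "$3")" "$4"
}

# Example, left commented out so the sketch sends nothing when sourced:
# nsca_service_line "client-host" "Dummy Service" CRITICAL "Run for it!" | \
#     /usr/sbin/send_nsca -H opsmon-host.example.com -c /etc/nagios/send_nsca.cfg
```

Treating unrecognised names as UNKNOWN is a deliberate choice here: a typo in the state name then shows up on OPSMON as an odd-looking alert rather than as a false OK.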

Configure Nagios to Forward Alarms

Create the following Nagios objects in your configuration – either in one file, or individual files, depending on your setup.

Alert Forwarding Notification Commands

These commands, when executed by Nagios, will use the event environment data (macros) to send an NSCA message to the OPSMON server for service or host events. I like to use the tee command to also send the same text to a log file while debugging.

define command {
   command_name   notify-service-by-nsca
   command_line   /usr/bin/printf "%s\t%s\t%s\t%s\n" "$HOSTNAME$" "$SERVICEDESC$" "$SERVICESTATEID$" "$SERVICEOUTPUT$|$SERVICEPERFDATA$" | tee -a /tmp/service_alert.log | /usr/sbin/send_nsca -H $CONTACTADDRESS1$ -c /etc/nagios/send_nsca.cfg
}

define command {
   command_name   notify-host-by-nsca
   command_line   /usr/bin/printf "%s\t%s\t%s\n" "$HOSTNAME$" "$HOSTSTATEID$" "$HOSTOUTPUT$" | /usr/sbin/send_nsca -H $CONTACTADDRESS1$ -c /etc/nagios/send_nsca.cfg 
}

The variables written like $HOSTNAME$ are Nagios macros; when a notification fires, they are expanded to values for the host and service in question. Of particular note is $CONTACTADDRESS1$, which takes its value from the address1 directive of the contact definition, described below.

Contact for Receiving Alerts

Because alerts are being forwarded by means of notifications, a “Contact” needs to be created to which these notifications will be sent:

define contact {
   contact_name OPSMON
   service_notification_period 24x7
   host_notification_period 24x7
   service_notification_options w,u,c,r,f,s 
       ; all service states, flapping events and scheduled downtime events
   host_notification_options d,u,r,f,s 
       ; all host states, flapping events, and scheduled downtime events
   service_notification_commands notify-service-by-nsca
   host_notification_commands notify-host-by-nsca
   address1 opsmon-host.example.com
}

Use the Contact for Service Notifications

The contact OPSMON can now be referenced in service and host definitions. This means that any alert on the service will be sent to the OPSMON contact (the server opsmon-host.example.com), using the send_nsca command.

define service {
   host_name dummy-host
   use generic-service
   check_interval 1
   retry_interval 1
   max_check_attempts 2
   contacts OPSMON
}

Configure Nagios to Receive NSCA alerts

The upstream monitoring server (which I’ve been referring to as OPSMON) is configured to receive alerts passively – that is, it’s not actively polling its checks, it’s receiving events as they happen.

To configure a service to be passive, set passive_checks_enabled to 1 and active_checks_enabled to 0 in the service definition or template.

Note that for any service to be monitored in the upstream server, the hostname and service description must match exactly what is being sent via the send_nsca command.
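
As a rough illustration, a passive service on OPSMON matching the earlier test message might look something like the fragment below. It assumes a host called client-host is already defined on OPSMON and that the generic-service template supplies the remaining required directives; the check_command shown is a placeholder and assumes a command of that name exists in your configuration.

```
define service {
   use generic-service
   host_name client-host
   service_description Dummy Service
   active_checks_enabled 0
       ; OPSMON never polls this service itself
   passive_checks_enabled 1
       ; accept results submitted through NSCA
   check_command check_dummy
       ; placeholder (assumption): must name a command defined on OPSMON
}
```

The host_name and service_description values here are the same strings used in the send_nsca test earlier, character for character.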

This post is a brief example, and is not meant to be exhaustive. It’s hoped that it will provide a starting point for increasing the functions of a Nagios setup.

More information on the Nagios configuration files and object definitions can be found at:
http://nagios.sourceforge.net/docs/3_0/objectdefinitions.html


Matt Parsons is a freelance Linux specialist who has designed, built and supported Unix and Linux systems in the finance, telecommunications and media industries.

He lives and works in London.
