[Nagiosplug-help] check_nagios -C problem
Franz, Jay
Jay.Franz at ssa.gov
Wed Jun 27 01:01:10 CEST 2012
As a wise admin once said, necessity is both a mother and motivator. The plug-in fails to identify the Nagios command by its full pathname because of the options it passes to the 'ps' command (revealed by running the plug-in in "verbose" mode). The 'ps' command and its options appear to be hard-coded in the 'configure' script for AIX and HP-UX. Specifically, the resulting 'ps' options, '-el', do not display the full pathname of the executing process; they show only the command name itself.
For example:
# /usr/bin/ps -el | egrep "[n]agios"
2401 R 108 11743 1 0 152 20 e000000174822280 535 - ? 00:00 nagios
Versus:
# /usr/bin/ps -ef | egrep "[n]agios"
nagios 11743 1 0 18:11:24 ? 00:00 /opt/iexpress/nagios/bin/nagios -d /opt/iexpress/nagios/etc/nagios.cfg
As a result, passing any part of the pathname to the plug-in will generate a CRITICAL result.
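To illustrate, here is a minimal Python sketch of why the full pathname cannot match. It is not the plug-in's actual code (the plug-in is written in C); it assumes the -C string is looked up as a substring of the command column, which is consistent with the results we saw. The helper name `command_matches` and the field indexes are my own, chosen for illustration. The sample lines are the 'ps' output shown above.

```python
# Sample lines copied from the 'ps' output shown above.
PS_EL_LINE = "2401 R 108 11743 1 0 152 20 e000000174822280 535 - ? 00:00 nagios"
PS_EF_LINE = ("nagios 11743 1 0 18:11:24 ? 00:00 "
              "/opt/iexpress/nagios/bin/nagios -d /opt/iexpress/nagios/etc/nagios.cfg")

FULL_PATH = "/opt/iexpress/nagios/bin/nagios"


def command_matches(ps_line, first_command_field, expected):
    """Return True if `expected` occurs in the command portion of a ps line.

    `first_command_field` is the zero-based index of the first
    whitespace-separated field that belongs to the command
    (hypothetical helper; the substring rule is an assumption).
    """
    fields = ps_line.split()
    command = " ".join(fields[first_command_field:])
    return expected in command


# 'ps -el': the command is field 13 and holds only the basename,
# so the full pathname can never match.
print(command_matches(PS_EL_LINE, 13, FULL_PATH))  # False
print(command_matches(PS_EL_LINE, 13, "nagios"))   # True

# 'ps -ef': the command starts at field 7 and includes the full pathname.
print(command_matches(PS_EF_LINE, 7, FULL_PATH))   # True
```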
So, our solution is to pass only the command name and to rename the plug-in so that it no longer matches itself. A bit of a kludge, perhaps, but like most kludges, it does the trick.
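A rough Python sketch of why the rename helps, again assuming substring matching on the command name (an assumption consistent with the plug-in matching itself). The process lists are made-up sample data, and 'check_core' is a hypothetical new name used only for illustration, not the name we actually chose.

```python
def count_matches(commands, expected):
    """Count command names that contain `expected` as a substring."""
    return sum(1 for command in commands if expected in command)


# Command names as 'ps -el' would report them (basenames only).
# Nagios itself is stopped in both snapshots; only the plug-in is running.
# The entries are illustrative sample data, not real output.
stopped_original_name = ["init", "sh", "check_nagios"]
stopped_renamed = ["init", "sh", "check_core"]  # hypothetical new plug-in name

# Under its original name the plug-in counts itself, so the count never hits 0.
print(count_matches(stopped_original_name, "nagios"))  # 1

# Once renamed, nothing matches and a stopped Nagios can be detected.
print(count_matches(stopped_renamed, "nagios"))        # 0
```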
-----Original Message-----
From: Franz, Jay [mailto:Jay.Franz at ssa.gov]
Sent: Tuesday, June 26, 2012 13:41
To: 'nagiosplug-help at lists.sourceforge.net'
Subject: [Nagiosplug-help] check_nagios -C problem
We are in the process of setting up failover monitoring for our existing Nagios server and are experiencing some problems with the 'check_nagios' plug-in. Specifically, it does not appear to recognize our full path command string. Instead, we are only able to make it work by stripping the command path down to its basename (i.e., '/opt/iexpress/nagios/bin/nagios' versus 'nagios'). Our OS, Nagios core, and plug-in versions follow, as well as the process status output of our Nagios command and the execution results from the 'check_nagios' plug-in. Any advice would be appreciated. Thanks.
--------------------
OS:
# uname -sr
HP-UX B.11.23
Nagios Core:
# /opt/iexpress/nagios/bin/nagios -v /opt/iexpress/nagios/etc/nagios.cfg | egrep "Nagios Core"
Nagios Core 3.2.3
Plugin:
# /usr/local/nagios/libexec/check_nagios --version
check_nagios v1.4.15 (nagios-plugins 1.4.15)
--------------------
# ps -ef | egrep "[/]opt/iexpress/nagios/bin/nagios"
nagios 9817 1 0 Jun 22 ? 05:34 /opt/iexpress/nagios/bin/nagios -d /opt/iexpress/nagios/etc/nagios.cfg
# /usr/local/nagios/libexec/check_nagios -e 60 -F /opt/iexpress/nagios/var/nagios.log -C /opt/iexpress/nagios/bin/nagios
NAGIOS CRITICAL: Could not locate a running Nagios process!
# /usr/local/nagios/libexec/check_nagios -e 60 -F /opt/iexpress/nagios/var/nagios.log -C nagios
NAGIOS OK: 2 processes, status log updated 1822 seconds ago
While the second iteration works, more or less, it will never return a CRITICAL status because it always matches against itself. That is, the 'check_nagios' script shows up in the list of processes when it executes.
For example, if we stop the Nagios server, the 'check_nagios' script still returns an OK status:
# /sbin/init.d/nagios stop
Stopping nagios:
done.
# ps -ef | egrep "[/]opt/iexpress/nagios/bin/nagios"
<NO OUTPUT>
# ps -ef | egrep "[n]agios"
<NO OUTPUT>
# /usr/local/nagios/libexec/check_nagios -e 60 -F /opt/iexpress/nagios/var/nagios.log -C nagios
NAGIOS OK: 1 process, status log updated 15 seconds ago
Even if we reduce the expire window to 1, we never get more than a WARNING.
# /usr/local/nagios/libexec/check_nagios -e 60 -F /opt/iexpress/nagios/var/nagios.log -C nagios
NAGIOS OK: 1 process, status log updated 268 seconds ago
# /usr/local/nagios/libexec/check_nagios -e 1 -F /opt/iexpress/nagios/var/nagios.log -C nagios
NAGIOS WARNING: 1 process, status log updated 272 seconds ago
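The results above are consistent with decision logic along the following lines. This is only a sketch inferred from the observed output, not the plug-in's source; the function name `nagios_status` is hypothetical, and it assumes the -e value is in minutes, which fits the log ages reported above.

```python
def nagios_status(process_count, log_age_seconds, expire_minutes):
    """Model the observed results (inferred from output, not from the source).

    Assumes -e is given in minutes and that a stale log alone only
    raises a WARNING.
    """
    if process_count == 0:
        # No matching process at all: the only path to CRITICAL we observed.
        return "CRITICAL"
    if log_age_seconds > expire_minutes * 60:
        # A process matched, but the status log is older than the window.
        return "WARNING"
    return "OK"


print(nagios_status(2, 1822, 60))  # OK
print(nagios_status(1, 268, 60))   # OK
print(nagios_status(1, 272, 1))    # WARNING
print(nagios_status(0, 272, 1))    # CRITICAL
```

Because the plug-in always counts itself, the process count never reaches zero, which would explain why we never see CRITICAL.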