[Nagiosplug-help] Inexplicable pattern match failure of check_http since update
Ralph.Grothe at itdz-berlin.de
Wed Apr 1 11:43:53 CEST 2009
Hello Nagios Plug-in List Members,
I recently updated my Nagios server from release 2.9 to 3.0.6, together with the latest stable release of the Nagios Plug-ins.
After a day of only minor adaptations to my host, service, and check command definitions, the vast majority of them are now running fine,
and it was a rather seamless update.
Yet I still encounter an issue, inexplicable to me, with one of my checks of a Tomcat server.
When the check is run by the Nagios scheduler I always get a HARD CRITICAL error owing to a mismatch
between the regex defined in the check command and service definition and what I expect to receive as HTTP response
from the checked Tomcat Manager container.
[nagios at nagsaz:~]
$ grep aDISWeb\ Tomcat /var/log/nagios/nagios.log
[1238536800] CURRENT SERVICE STATE: uranus;aDISWeb Tomcat;CRITICAL;HARD;5;HTTP CRITICAL - pattern not found
[nagios at nagsaz:~]
$ perl -le 'print scalar localtime 1238536800'
Wed Apr 1 00:00:00 2009
Whereas when I run the check from the shell as user nagios on my Nagios server, with exactly the same set of arguments and macros
as passed to the service, all is OK.
[nagios at nagsaz:~]
$ /opt/nagios/plugins/libexec/check_http -H uranus -p 8081 -u /manager/sessions?path=/aDISWeb -a bogususer:boguspasswd -l -r '^OK.*:\s*[0-9]*\sSitzungen'
HTTP OK HTTP/1.1 200 OK - 0,007 second response time |time=0,007391s;;;0,000000 size=398B;;;0
As I said, the arguments passed to the above manual check are (apart from the credentials, which are obfuscated here) exactly the same as in the
definitions of the service and the check command.
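Since the manual run and the scheduled run use the same arguments, one thing worth ruling out is the environment the plug-in inherits. The following is a minimal sketch, not a confirmed diagnosis: run_clean is a hypothetical helper name, CHECK_LOCALE is an assumed variable for this sketch, and the idea that the Nagios daemon starts its checks with a sparser environment than a login shell is an unverified assumption.

```shell
# Hypothetical helper: run a command with an emptied environment so that
# only PATH and the locale are defined, roughly imitating a daemon context.
# CHECK_LOCALE is an assumed knob for this sketch; it defaults to C.
run_clean() {
    env -i PATH=/usr/bin:/bin LC_ALL="${CHECK_LOCALE:-C}" "$@"
}

# Demonstration with a harmless command: print the locale the child sees.
run_clean sh -c 'echo "LC_ALL=$LC_ALL"'
```

With the plug-in in place of the demonstration command, the call would be run_clean followed by the check_http command line shown above. If the result changes under the scrubbed environment, the difference between the two runs is likely environmental.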
Another thing that changed together with my Nagios and Plug-ins update was that I was forced to set the global locale on the host
that runs this service to de_DE.utf-8 (it was en_US.iso885915 before).
The effect on this check was that the HTTP response now contains German text in the HTML markup, so the only change I had to make
was to replace "sessions" with "Sitzungen" in my pattern.
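The adapted pattern can also be checked independently of Nagios against a saved copy of the response body. The sketch below assumes that grep -E treats the expression much like check_http does, which is not verified here. matches_pattern is a hypothetical helper, the sample lines are made up and are not real Tomcat Manager output, and \s is written as [[:space:]] so the expression stays within POSIX extended regex syntax.

```shell
# Pattern from the check, with \s rewritten as the POSIX class [[:space:]].
pattern='^OK.*:[[:space:]]*[0-9]*[[:space:]]Sitzungen'

# Hypothetical helper: succeed (exit 0) if the given text matches the pattern.
matches_pattern() {
    printf '%s\n' "$1" | grep -Eq "$pattern"
}

# Made-up sample lines, only to exercise the pattern.
for line in 'OK - example: 3 Sitzungen' 'OK - example: 3 sessions'; do
    if matches_pattern "$line"; then
        echo "match:    $line"
    else
        echo "no match: $line"
    fi
done
```

Feeding the helper the actual body, for instance one saved from a manual request, would show whether the German wording is really what the server returns.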
The version of the check_http plug-in in use is:
[nagios at nagsaz:~]
$ /opt/nagios/plugins/libexec/check_http -V
check_http v2053 (nagios-plugins 1.4.13)
Beyond this I noticed that somehow external commands related to this service check don't seem to be executed by the scheduler.
[nagios at nagsaz:~]
$ printf "[%lu] SCHEDULE_SVC_CHECK;uranus;aDISWeb Tomcat;%lu\n" $(date +%s) $(($(date +%s)+120))>/opt/nagios/var/rw/nagios.cmd
[nagios at nagsaz:~]
$ grep aDISWeb\ Tomcat /var/log/nagios/nagios.log
[1238536800] CURRENT SERVICE STATE: uranus;aDISWeb Tomcat;CRITICAL;HARD;5;HTTP CRITICAL - pattern not found
[1238578049] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;uranus;aDISWeb Tomcat;1238578169
[nagios at nagsaz:~]
$ perl -le 'print scalar localtime 1238578169'
Wed Apr 1 11:29:29 2009
[nagios at nagsaz:~]
$ date
Wed Apr 1 11:27:57 CEST 2009
...two minutes later (or even hours later; it makes no difference)
[nagios at nagsaz:~]
$ date
Wed Apr 1 11:30:39 CEST 2009
[nagios at nagsaz:~]
$ grep aDISWeb\ Tomcat /var/log/nagios/nagios.log
[1238536800] CURRENT SERVICE STATE: uranus;aDISWeb Tomcat;CRITICAL;HARD;5;HTTP CRITICAL - pattern not found
[1238578049] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;uranus;aDISWeb Tomcat;1238578169
It makes no difference whether I reschedule the check via the web interface or write the command to the FIFO as done above.
So why doesn't an externally forced rescheduling of the check get executed?
Is this due to some clever scheduling algorithm that I have missed so far, or to some neglected config setting?
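For comparison, Nagios also accepts a forced variant of this external command, SCHEDULE_FORCED_SVC_CHECK, which takes the same fields. The sketch below only builds the command line. forced_svc_check is a hypothetical helper name, and whether a forced check behaves any differently in this setup is an open question, not a confirmed fix.

```shell
# Hypothetical helper: print a SCHEDULE_FORCED_SVC_CHECK command line.
# Arguments: host, service description, submission time, check time
# (both times as seconds since the epoch).
forced_svc_check() {
    printf '[%lu] SCHEDULE_FORCED_SVC_CHECK;%s;%s;%lu\n' "$3" "$1" "$2" "$4"
}

# Build a command for a check two minutes from now and show it on stdout.
now=$(date +%s)
forced_svc_check uranus 'aDISWeb Tomcat' "$now" "$((now + 120))"
```

Redirecting that output to /opt/nagios/var/rw/nagios.cmd, as in the printf call above, would submit it to the scheduler.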
Many thanx for your patience
Ralph