[Nagiosplug-help] check_icmp seems flapping - followon to RE: make of nagios-plugins-1.4.5 on AIX 4.3 fails
Andreas Ericsson
ae at op5.se
Thu Nov 30 14:00:33 CET 2006
Ralph.Grothe at itdz-berlin.de wrote:
> I still seem to have some serious trouble with my build of the
> check_icmp plugin.
>
> Because the make was prematurely aborted (owing to the check_swap
> error),
> I manually chown-ed check_icmp to root and chmod-ed it u+s,
> because I assume ICMP packet generation requires root privileges.
>
> I then copied it into $USER1$ and created a hard link to it there
> named check_host.
>
> In my hosts.cfg template I defined a check-host-alive as default
> check_command
> that looks like this
>
> define command {
> command_name check-host-alive
> command_line $USER1$/check_host -H $HOSTADDRESS$ -t 15 -c 10000
> }
>
>
> After a bit of further tweaking of my config files to reflect a
> hopefully cleaner overall layout
> I incautiously started the new 2.5 nagios after all pre-flight
> checks were satisfied
> without prior disabling of host notifications.
>
> I was then shocked to see that nagios was cheerfully churning
> out dozens of alert notifications
> when the hosts' states changed from soft critical to hard
> critical,
> only to relapse minutes later from hard critical to hard ok and
> notify about the recovery
> (because host notification_options of course included r in my
> template).
> Many hosts were flip-flopping like this.
>
> I then ran check_host several times manually and noticed the
> following hang:
>
>
> $ ~/libexec/check_host -H somehost
> mode: 1
> CRITICAL - somehost: rta nan, lost 100%|rta=0.000ms;1000.000;1000.000;0; pl=100%;100;100;;
>
>
> But a ping run immediately afterwards always got its echo replies:
>
> $ ping -c 3 somehost
> PING somehost.somewhere.tld: (123.123.123.123): 56 data bytes
> 64 bytes from 123.123.123.123: icmp_seq=0 ttl=248 time=3 ms
> 64 bytes from 123.123.123.123: icmp_seq=1 ttl=248 time=3 ms
> 64 bytes from 123.123.123.123: icmp_seq=2 ttl=248 time=3 ms
>
> ----somehost.somewhere.tld PING Statistics----
> 3 packets transmitted, 3 packets received, 0% packet loss
> round-trip min/avg/max = 3/3/3 ms
>
>
> Now I am curious whether my compilation of check_icmp is ok?
>
You'd get this problem if you use an old check_icmp on a system that
handles process IDs > 65535. The old version of check_icmp didn't
recognize valid ICMP responses in that case, because the id field in the
ICMP header is only 16 bits wide, so a 32-bit pid doesn't fit in it. This
would typically only happen when the pid of check_icmp itself is larger
than 65535, which would explain the checks hopping between OK for a while
and non-OK for a while. Judging by "mode: 1" above, I'd say your check_icmp
is fairly old and needs to be upgraded. Which version of the plugins are
you using?
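
To make the pid/id mismatch concrete, here is a minimal Python sketch. It
is not the check_icmp source (which is C). The function names are made up
for illustration, and the "buggy" comparison is only an assumption about
how an untruncated pid can fail to match the 16-bit id that comes back in
the echo reply.

```python
ICMP_ID_MASK = 0xFFFF  # the ICMP echo identifier field is 16 bits wide


def echo_id_for(pid: int) -> int:
    """Identifier that fits on the wire: the low 16 bits of the pid."""
    return pid & ICMP_ID_MASK


def reply_matches_buggy(reply_id: int, pid: int) -> bool:
    # Assumed old behaviour: compare the 16-bit id with the full pid.
    # This can never be true once pid > 65535.
    return reply_id == pid


def reply_matches_fixed(reply_id: int, pid: int) -> bool:
    # Compare like with like: truncate the pid before comparing.
    return reply_id == (pid & ICMP_ID_MASK)


if __name__ == "__main__":
    for pid in (1234, 70000):
        rid = echo_id_for(pid)
        print(pid, rid, reply_matches_buggy(rid, pid), reply_matches_fixed(rid, pid))
```

With a pid below 65536 both comparisons agree, so the check reports OK.
Once the pid counter climbs past 65535, the untruncated comparison rejects
every reply and the check reports 100% loss until the pids wrap around
again, which fits the flapping pattern.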
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231