[Nagiosplug-help] Bug in check_icmp/MODE_HOSTCHECK? (1.4.11)
Holger Weiss
holger at CIS.FU-Berlin.DE
Tue Jan 8 04:23:56 CET 2008
* Wolfgang Barth <wob at swobspace.de> [2007-12-28 14:05]:
> I'm using check_icmp from nagios-plugins-1.4.11. check_host is a symlink to
> check_icmp.
>
> time ./check_icmp -H 172.17.129.3
> CRITICAL - 172.17.129.3: rta nan, lost 100%|\
> rta=0.000ms;200.000;500.000;0; pl=100%;40;80;;
> 0.000u 0.000s 0:03.62 0.0% 0+0k 0+0io 0pf+0w
> ^^^^^^ OK
>
> time ./check_host -H 172.17.129.3
> CRITICAL - 172.17.129.3: rta nan, lost 100%|\
> rta=0.000ms;1000.000;1000.000;0; pl=100%;100;100;;
> 0.000u 0.000s 0:10.00 0.0% 0+0k 0+0io 0pf+0w
> ^^^^^^ BAD
>
> The host does not exist. round about 3 seconds after the first icmp packet
> a router answers with ICMP host unreachable. check_icmp aborts, check_host
> not.
>
> In earlier versions of Andreas Ericsson's check_icmp the goal of check_host
> was to abort immediately after such an ICMP host unreachable:
Actually, the detection of destination unreachable messages was broken,
no matter whether "check_icmp" or "check_host" was called. This is
fixed in SVN now. Thanks a lot for the report!
The reason for the difference you showed above is not that "check_icmp"
detected the destination unreachable message correctly; it simply gave
up much earlier than "check_host". While "check_host" tries to exit as
fast as possible as soon as it receives a response, it also waits much
longer for the first response, because it sets different default values
for "-c" and "-i". That is, if you call
$ check_host -c 500 -i 80 -H 172.17.129.3
it'll be as fast as "check_icmp", whereas
$ check_icmp -c 1000 -i 1000 -H 172.17.129.3
will be as slow as "check_host".
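
As a rough illustration only (this is a Python model, not the plugin's C
code), the defaults can be thought of as depending on the name the binary
is called by. The numbers are the ones implied by the two commands above;
the function, class, field and mode names are made up for this sketch.

```python
import os
from dataclasses import dataclass


@dataclass
class Defaults:
    mode: str            # hypothetical label, not an identifier from the plugin
    crit_rta_ms: float   # default for "-c", in milliseconds
    interval_ms: float   # default for "-i", in milliseconds


def defaults_for(argv0: str) -> Defaults:
    """Pick defaults from the name the program was invoked as."""
    name = os.path.basename(argv0)
    if name == "check_host":
        # Host check mode: wait much longer for the first response.
        return Defaults("hostcheck", 1000.0, 1000.0)
    # Called as check_icmp (or anything else): the shorter defaults.
    return Defaults("icmp", 500.0, 80.0)


if __name__ == "__main__":
    for name in ("./check_icmp", "./check_host"):
        print(name, defaults_for(name))
```

Passing "-c" and "-i" explicitly overrides these defaults, which is why
the two commands above can be made to behave alike.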
This difference will still show up with the fixed version in SVN in case
you don't get any response for a host check (because it's dropped by a
packet filter or whatever). The check_icmp revision which introduced the
"check_host" feature in our CVS (1.5, 2005/02/01) behaved like this
already, though. The idea seems to be something like "try really hard
to get a response, but exit immediately if we got one". The comment on
"check_host" in the source says:
| MODE_HOSTCHECK: Return immediately upon any sign of life. In
| addition, sends packets to ALL addresses assigned to this host (as
| returned by gethostbyname() or gethostbyaddr()) and expects one host
| only to be checked at a time. Therefore, any packet response what so
| ever will count as a sign of life, even when received outside crit.rta
| limit. Do not misspell any additional IP's.
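
A minimal sketch of that idea, again in Python and not taken from the
plugin: the check returns at the first response it sees, however late,
and otherwise keeps waiting until the overall timeout. The function name,
its parameters and the 10 second timeout in the example are assumptions
for illustration (10 seconds is simply the elapsed time in the
"check_host" run quoted above).

```python
from typing import Iterable, Tuple


def hostcheck(arrivals_ms: Iterable[float], timeout_ms: float) -> Tuple[bool, float]:
    """Model of "return immediately upon any sign of life".

    arrivals_ms: times (ms since start) at which any response arrived.
    Returns (alive, elapsed_ms) at the moment the check would exit.
    """
    in_time = [t for t in arrivals_ms if t <= timeout_ms]
    if in_time:
        # First response wins, even if it is slower than crit.rta.
        return True, min(in_time)
    # No response at all: we only give up once the timeout is reached.
    return False, timeout_ms


if __name__ == "__main__":
    print(hostcheck([1500.0, 2500.0], 10000.0))  # a slow but living host
    print(hostcheck([], 10000.0))                # nothing ever answers
```

The second call is the case described above: with no response of any
kind, the host check runs for the full timeout.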
Holger