[Nagiosplug-help] nrpe spams my logs
Andreas Ericsson
ae at op5.se
Sun Nov 13 06:24:48 CET 2005
Jeroen Demeyer wrote:
> Hello list,
>
> We installed nagios+nrpe on a cluster to monitor the health of our
> nodes. Nagios runs on a server, and the diskless nodes run nrpe.
> This works fine; however, sometimes nrpe starts spamming the syslog:
>
> Nov 13 13:46:51 node6 nrpe[10671]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10673]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10675]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10677]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10679]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10681]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10683]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10685]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10687]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10689]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10691]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10693]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10695]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10697]: Network server accept failure (22: Invalid argument)
> Nov 13 13:46:51 node6 nrpe[10699]: Network server accept failure (22: Invalid argument)
> (this goes on for a long time, >100 times per second)
>
> I suppose this is a bug in nrpe?
Yes and no. nrpe shouldn't retry the accept(2) syscall this often when
it keeps failing. The real problem lies elsewhere, though: somewhere it
either fails to obtain a socket or has its socket destroyed by the
kernel, so that by the time it calls accept(2) the descriptor is no
longer a usable socket.
Hope that made sense. It did in my head, but it doesn't look that way
now that it's down in writing.
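
To illustrate the "shouldn't retry so often" part, here is a minimal
sketch in Python of an accept loop that backs off after each consecutive
failure and gives up after a few of them. This is not how nrpe itself is
written (nrpe is C); the function name, its parameters and the default
limits are all made up for the example.

```python
import time


def accept_with_backoff(listener, handle, max_failures=5,
                        base_delay=0.01, max_delay=1.0, sleep=time.sleep):
    """Accept connections, waiting longer after each consecutive failure.

    Hypothetical helper, not part of nrpe. Returns the errno of the last
    error once max_failures accept() calls in a row have failed.
    """
    failures = 0
    delay = base_delay
    while True:
        try:
            conn, addr = listener.accept()
        except OSError as exc:
            failures += 1
            if failures >= max_failures:
                # The listening socket is probably gone; stop retrying
                # instead of logging the same error forever.
                return exc.errno
            sleep(delay)
            delay = min(delay * 2, max_delay)  # exponential backoff, capped
            continue
        # A successful accept resets the failure counter and the delay.
        failures = 0
        delay = base_delay
        with conn:
            handle(conn, addr)
```

With a loop like this, a dead listening socket produces a handful of log
lines and then a clean exit (or a re-creation of the socket) rather than
more than 100 messages per second.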
> Also, I shut down nagios on the server
> but nrpe keeps giving these error messages, even hours after nagios was
> stopped. Any ideas how to fix this problem?
>
Run nrpe under strace and tee the output to a file. If you manage to
reproduce this error while it is being traced, send me the output
(gzipped, preferably) and I'll see what's going on.
> This is on a recent Gentoo system with nagios-1.2 and nagios-nrpe-2.0-r1,
> 2.4.26 openMosix kernel.
>
You could try installing nrpe 2.2. It's available at
http://oss.op5.se/nagios/nrpe-2.2.tar.gz
Let me know if it fixes this particular problem for you.
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231