[Nagiosplug-help] Nagios crashes badly and takes out both machines!

Pedro Silva psilva at dcc.online.pt
Thu Apr 29 00:52:07 CEST 2004


Try running the machines without Nagios. If everything goes well, then
Nagios is responsible for the crash. Otherwise, something else is
making the machines go down.

After the test without Nagios, if the machines don't crash, you should
try disabling some plugins to see if they are responsible for those
problems.
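
When you test the plugins one by one, your own check_ps is worth a close
look, since it is the only non-standard one. I have not seen your script,
so the following is only a rough sketch of how such a check could be
written, not your implementation. The function names (evaluate, check_ps)
and the 5-second timeout are my assumptions. It runs ps once, with a
timeout so a heavily loaded host cannot leave the check hanging, and maps
the result to the standard Nagios exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN).

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def evaluate(ps_output, process_name):
    """Return (exit_code, message) given the command names printed by ps."""
    names = [line.strip() for line in ps_output.splitlines()]
    count = names.count(process_name)
    if count > 0:
        return OK, f"PROCS OK: {count} '{process_name}' process(es) running"
    return CRITICAL, f"PROCS CRITICAL: no '{process_name}' process found"


def check_ps(process_name, timeout_seconds=5):
    """Run ps once, with a timeout (assumed value) so the check cannot hang."""
    try:
        result = subprocess.run(
            ["ps", "-e", "-o", "comm="],  # one command name per line, no header
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
            check=True,
        )
    except (subprocess.SubprocessError, OSError) as exc:
        # Covers a timeout, a non-zero exit from ps, or ps being missing.
        return UNKNOWN, f"PROCS UNKNOWN: could not run ps ({exc})"
    return evaluate(result.stdout, process_name)
```

To use it as a plugin you would print the message and exit with the
returned code, for example: code, msg = check_ps("master"); print(msg);
sys.exit(code). The Postfix master daemon typically appears as "master"
in that ps column, but check what ps shows on your own system.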

Pedro

> Hi there
> 
> I'm looking for some help.
> 
> Over the last week my mail server and the machine monitoring it with 
> Nagios have crashed 3 times at the same time.
> 
> I'm not sure if it is the Nagios machine crashing and taking my mail 
> server with it somehow or the other way around.
> 
> In both situations I have seen increased load on my mail server, to the 
> point of nrpe sending me a socket timeout warning. Shortly after this 
> the machines become unusable and a hard reboot is the only way to fix it.
> 
> When both machines crash (mail server = Red Hat 9, Nagios = Fedora), I've gone 
> to the console on both machines and they are both filled with messages 
> saying "status=0". This is on BOTH machines.
> 
> I'm running nrpe on the mailserver checking load, number of processes, 
> disk space etc. The only anomalous thing is that I run my own plugin, 
> which I called check_ps, which scans 'ps' for a given process (just so I 
> know postfix is actually running!).
> 
> I was wondering if anyone could confirm whether or not it is Nagios that 
> is crashing my machines???
> 
> Kind Regards
> 
> Jon




