[Nagiosplug-help] load plugin
Dan Stromberg
strombrg at dcs.nac.uci.edu
Wed Feb 19 11:52:02 CET 2003
I'm pretty impressed by the speed and thoroughness of the answers on
this list.
I was assuming that check_load would talk to rpc.rstatd to get its
information. My mistake.
I've attached a Python script that popens rup to get load data from
rpc.rstatd. It has local policy (paths and thresholds) hard-coded in it,
but if someone were to parameterize it a bit, it could be added to the
collection of plugins, I suppose. A really nice one would probably be
written in C, though, so that it doesn't have to popen something else or
fire up an interpreter.
I have some simple plugins for checking NFS, NIS, and Legato's
nsrexecd, if anyone's interested in seeing those. Oh, and three fairly
primitive ones for SSL IMAP, SSL POP, and SSL HTTP. I believe these are
all done in shell.
BTW, does Nagios automatically kill off plugins that hang around too
long? I've seen processes that use TCP get stuck for days sometimes on
down hosts.
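
Whatever Nagios itself does about this, a plugin can also enforce its own time limit. The following is only a rough sketch of that idea, assuming a Python new enough to have subprocess.run; the function name run_with_timeout and the 10-second default are made up for illustration and are not part of any existing plugin.

```python
import subprocess

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def run_with_timeout(cmd, timeout_s=10.0):
    """Run cmd (a list of arguments) and return (exit_code, first_output_line).

    If the command has not finished after timeout_s seconds, it is killed
    and CRITICAL is returned, so the check cannot hang indefinitely.
    run_with_timeout is a hypothetical helper, not an existing API.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return CRITICAL, 'timed out after %g seconds' % timeout_s
    except OSError as exc:
        # The command could not be started at all (missing binary, etc.).
        return UNKNOWN, 'could not run %s: %s' % (cmd[0], exc)
    lines = proc.stdout.splitlines()
    return OK, (lines[0] if lines else '')
```

A plugin would call something like run_with_timeout(['/dcs/etc/rup', host]) and then parse the returned line as usual, treating a non-OK code as the final result.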
On Wed, 2003-02-19 at 10:23, Russell Scibetti wrote:
> Are you just calling the check_load plugin as it is to check the load on
> remote servers? check_load must be run locally on the box you are
> checking. If you aren't doing this, then you are actually just getting
> the load on your local monitoring machine for every check_load call
> (would explain them being identical!).
>
> To use check_load for remote servers, you need to install the plugins
> physically on the remote machines and then either use:
>
> check_by_ssh - to ssh to the remote box, execute the plugin and return
> the result
>
> check_nrpe - Also install nrpe (nagios remote plugin executor) on the
> remote machines and configure it to run the load check. You would then
> use the check_nrpe plugin from the monitoring server (read the docs for
> more on this)
>
> nagios-statd - A daemon that runs on each remote machine that you query
> from the monitoring box to get the health of the remote machine (load,
> disk, etc.). I don't know much about this option.
>
> Hope this helps.
>
> Russell Scibetti
>
> Dan Stromberg wrote:
>
> >I set up the load plugin for a few of my machines. Two are Suns, one is
> >Linux. If it works out well, I'd like to get it set up for many more
> >hosts.
> >
> >One of the suns is a critical server I look after for a client. The
> >plugin has been reporting some astonishingly high load average numbers
> >for this machine. We've been trying to convince the client for some
> >time to upgrade this hardware. If we can trust these numbers, this
> >might be the information we need to get the upgrade.
> >
> >However, another of the machines I set up with the load plugin is a
> >Red Hat box. It didn't have rstatd configured, but the load plugin kept
> >giving fairly high numbers for it too. Once, Nagios gave two
> >consecutive criticals with identical numbers for the Sun and the Red Hat
> >box, even though I had neglected to configure rstatd on the Red Hat box.
> >
> >How trustworthy is the load plugin? Is it, or perhaps the nagios
> >framework, sometimes using uninitialized memory? Why didn't it give an
> >error about not being able to contact my rstatd?
> >
> >TIA.
> >_______________________________________________
> >Nagiosplug-help mailing list
> >Nagiosplug-help at lists.sourceforge.net
> >https://lists.sourceforge.net/lists/listinfo/nagiosplug-help
> >::: Please include plugins version (-v) and OS when reporting any issue.
> >::: Messages without supporting info will risk being sent to /dev/null
-------------- next part --------------
#!/dcs/bin/python
# Nagios plugin: read a host's load averages from rpc.rstatd via rup.
import sys
import os
import re

host = sys.argv[1]

# rup queries rpc.rstatd on the remote host. stderr is folded into stdout
# so that an error message simply fails to match the pattern below.
pipe = os.popen('/dcs/etc/rup ' + host + ' 2>&1', 'r')
line = pipe.readline()
pipe.close()

# Sample rup output:
#meter.eng up 154 days, 4:40, load average: 2.73 4.21 4.25
r = re.compile(r'^.*load average: ([0-9.]+) ([0-9.]+) ([0-9.]+).*$')
m = r.match(line)
if not m:
    print('service unavailable')
    sys.exit(2)

one_min = float(m.group(1))
five_min = float(m.group(2))
loads = ' '.join(m.groups())

# Local policy: thresholds on the 1- and 5-minute load averages.
if one_min > 16.0 or five_min > 12.0:
    print('load critical: ' + loads)
    sys.exit(2)
elif one_min > 12.0 or five_min > 8.0:
    print('load warning: ' + loads)
    sys.exit(1)
else:
    print('load: ' + loads)
    sys.exit(0)
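
As a rough idea of what "parameterizing it a bit" might look like, here is a sketch that separates the parsing and the threshold policy from the script above and takes the thresholds from the command line. The names parse_loads, classify, parse_pair and build_parser are hypothetical, and the -w/-c options taking a "1min,5min" pair are an assumption loosely modeled on the warning/critical convention of other plugins. The defaults are the local thresholds from the attached script.

```python
import argparse
import re

# Accepts rup output with or without commas between the three averages.
LOAD_RE = re.compile(r'load average:\s*([0-9.]+),?\s+([0-9.]+),?\s+([0-9.]+)')


def parse_loads(line):
    """Return (1min, 5min, 15min) as floats, or None if line has no loads."""
    m = LOAD_RE.search(line)
    if not m:
        return None
    return tuple(float(g) for g in m.groups())


def classify(loads, warn, crit):
    """Map load averages to a (Nagios exit code, label) pair.

    warn and crit are (1min, 5min) threshold pairs; exceeding either
    element of a pair triggers that level.
    """
    one_min, five_min = loads[0], loads[1]
    if one_min > crit[0] or five_min > crit[1]:
        return 2, 'load critical'
    if one_min > warn[0] or five_min > warn[1]:
        return 1, 'load warning'
    return 0, 'load'


def parse_pair(text):
    """Parse a '1min,5min' threshold string such as '12,8'."""
    one_min, five_min = text.split(',')
    return float(one_min), float(five_min)


def build_parser():
    """Command-line options for the hypothetical parameterized plugin."""
    p = argparse.ArgumentParser(description='check load via rup/rpc.rstatd')
    p.add_argument('host')
    p.add_argument('-w', '--warning', type=parse_pair, default=(12.0, 8.0))
    p.add_argument('-c', '--critical', type=parse_pair, default=(16.0, 12.0))
    return p
```

The main program would then read one line from rup as before, exit 2 with "service unavailable" when parse_loads returns None, and otherwise print the label and loads and exit with the code from classify.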