[Nagiosplug-help] RE: [Nagios-users] *** RESOLVED***check_snmp CPU Load strange result
Pascal Wessel
pascal.wessel at media-online.ch
Tue Dec 3 23:36:03 CET 2002
Dear Subhendu,
Indeed the new beta2 plugin check_snmp is working as expected :-)
Now I have OK, WARNING and CRITICAL status, depending on the check I
perform.
Thanks a lot for your kind help, time, and advice.
Warm regards,
Pascal Wessel
-----Original Message-----
From: Subhendu Ghosh [mailto:sghosh at sghosh.org]
Sent: Tuesday, 3 December 2002 17:55
To: Pascal Wessel
Subject: RE: [Nagios-users] check_snmp CPU Load strange result
Could you try the plugins from beta2? A few changes were made to
check_snmp (it will probably get rewritten for beta3).
-sg
On Tue, 3 Dec 2002, Pascal Wessel wrote:
> You asked: Please post the version of the plugin/os/net-snmp
>
> Hummm... OK, I'll say it again (I copy-pasted the needed info from my
> first mail; I agree it's at the very end of this thread):
>
> ---snip---
> > My Nagios system installation is as follows:
> >
> > System Intel i686, Mandrake 9.0, Kernel 2.4.19-16
> > NAGIOS: Nagios 1.0b6
> > Plugins: nagios-plugins-200211131100
> > Check_snmp: Revision: 1.17
> > SNMP:
> > libsnmp0-4.2.3-4mdk
> > ucd-snmp-4.2.3-4mdk
> > ucd-snmp-utils-4.2.3-4mdk
> ---snip---
> I had this:
> [libexec]# ldd check_snmp
> libutil.so.1 => /lib/libutil.so.1 (0x40023000)
> libc.so.6 => /lib/i686/libc.so.6 (0x40026000)
> /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
>
> check_snmp treats bare numbers in warning and critical as upper
> bounds, so -w'60,69' is interpreted as warn on 0-60 for the first and
> 0-69 for the second...
>
> Very interesting... I didn't catch it at first.
>
> I made some tests with different values for the warning and/or critical
> thresholds... Very strange, I only ever get WARNING status back from
> them. To clarify a bit I made tests with only one OID (the 1 min
> CPU load), with the same result: WARNING is always returned. By reading
> check_snmp --help I saw that I can use absolute values (like -w
> '0:30') if colons are used.
> Ranges are inclusive and are indicated with colons. When specified as
> 'min:max' a STATE_OK will be returned if the result is within the
> indicated range or is equal to the upper or lower bound. A non-OK
> state will be returned if the result is outside the specified range.
>
> Then....some tests again....
> Eg (with the 1 min CPU load OID):
>
> ./check_snmp -v -t 10 -H 192.168.1.1 -o .1.3.6.1.4.1.9.2.1.57.0 -C
> publicro -w '50:69' -c'70:99'
> enterprises.9.2.1.57.0 = 4
>
> SNMP WARNING - *4*
> That's ok... 4% is not in the range 50:69... Then it's a "non-OK"
> result, thus the WARNING for the -w switch.
>
> ./check_snmp -v -t 10 -H 192.168.1.1 -o .1.3.6.1.4.1.9.2.1.57.0 -C
> publicro -w '1:9' -c'10:99'
> enterprises.9.2.1.57.0 = 7
>
> SNMP WARNING - *7*
> 7% is within the range 1:9... I should have a STATE_OK.
> GrrrRRRrrrRRRRRR %#*$& !!! Why do I have WARNING ?????
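The range rule quoted from check_snmp --help can be written down as a small check. This is only a sketch of the documented 'min:max' wording for a single threshold, not the plugin's actual C code; the function names parse_range and in_range are made up for illustration, and it deliberately rejects anything that is not a full 'min:max' pair.

```python
def parse_range(spec):
    """Split a 'min:max' string into two numeric bounds (hypothetical helper)."""
    low, sep, high = spec.partition(":")
    if not sep or not low or not high:
        raise ValueError("expected 'min:max', got %r" % spec)
    return float(low), float(high)


def in_range(value, spec):
    """Return True when value is inside the inclusive range (the STATE_OK case)."""
    low, high = parse_range(spec)
    return low <= value <= high


if __name__ == "__main__":
    # The two readings from the tests above, checked against their -w ranges.
    print(in_range(4, "50:69"))  # outside the range, so non-OK per the help text
    print(in_range(7, "1:9"))    # inside the range, so STATE_OK per the help text
```

By this reading, the first test (4 against 50:69) should be non-OK and the second (7 against 1:9) should be OK for the -w threshold, which is the mismatch reported here.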
>
>
> Well, any help appreciated. Thanks in advance.
> Warm regards,
> Pascal
>
>
>
>
> -sg
>
> On Mon, 2 Dec 2002, Pascal Wessel wrote:
>
> > Nagios gives me a WARNING when running check_snmp for the CPU load of
> > a Cisco 3640, IOS (C3640-IK9O3S-M), Version 12.2(10a), but the CPU
> > load is below my warning threshold.
> >
> > When launched from the command-line with verbose output:
> >
> > [libexec]# ./check_snmp -v -t 10 -H 192.168.1.1 -o
> > .1.3.6.1.4.1.9.2.1.57.0,.1.3.6.1.4.1.9.2.1.58.0 -C publicro -w
> > '60,69', -c '70,80' -l 'CPU usage 1min/5min' -D ' / '
> > /usr/bin/snmpget -m ALL -v 1 -c publicro 192.168.1.1:161
> > .1.3.6.1.4.1.9.2.1.57.0 .1.3.6.1.4.1.9.2.1.58.0
> > enterprises.9.2.1.57.0 = 4
> > enterprises.9.2.1.58.0 = 3
> >
> > CPU usage 1min/5min WARNING - *4* / *3*
> >
> > As you can see (and if I understood the syntax), WARNING status
> > should be triggered when the CPU load is between 60 and 69%, and
> > CRITICAL status should be triggered when the router CPU is between
> > 70 and 80%.
> >
> > #----
> > My question is: why does this check report WARNING when my router CPU
> > load (4% last minute and 3% last 5 min) is below the WARNING threshold?
> > #----
> >
> > My Nagios system installation is as follows:
> >
> > System Intel i686, Mandrake 9.0, Kernel 2.4.19-16
> > NAGIOS: Nagios 1.0b6
> > Plugins: nagios-plugins-200211131100
> > Check_snmp: Revision: 1.17
> > SNMP:
> > libsnmp0-4.2.3-4mdk
> > ucd-snmp-4.2.3-4mdk
> > ucd-snmp-utils-4.2.3-4mdk
> >
> > Below is a snippet of my cfg files:
> >
> > #--- hosts.cfg for myrouter
> >
> > define host {
> >     name                          generic-host
> >     notifications_enabled         1   ; Host notifications are enabled
> >     event_handler_enabled         1   ; Host event handler is enabled
> >     flap_detection_enabled        1   ; Flap detection is enabled
> >     process_perf_data             1   ; Process performance data
> >     retain_status_information     1   ; Retain status information across program restarts
> >     retain_nonstatus_information  1   ; Retain non-status information across program restarts
> >     max_check_attempts            10
> >     register                      0   ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
> > }
> >
> > define host {
> >     use                           generic-host   ; Name of host template to use
> >     host_name                     myrouter
> >     alias                         Router Gva Coulou -6
> >     address                       192.168.1.1
> >     check_command                 check-host-alive
> >     notification_interval         60
> >     notification_period           24x7
> >     notification_options          d,u,r
> > }
> >
> > #--- services.cfg
> > define service {
> >     name                          generic-service ;
> >     active_checks_enabled         1   ; Active service checks are enabled
> >     passive_checks_enabled        1   ; Passive service checks are enabled/accepted
> >     parallelize_check             1   ; Active service checks should be parallelized
> >     obsess_over_service           1   ; We should obsess over this service (if necessary)
> >     check_freshness               0   ; Default is to NOT check service 'freshness'
> >     notifications_enabled         1   ; Service notifications are enabled
> >     event_handler_enabled         1   ; Service event handler is enabled
> >     flap_detection_enabled        1   ; Flap detection is enabled
> >     process_perf_data             1   ; Process performance data
> >     retain_status_information     1   ; Retain status information across program restarts
> >     retain_nonstatus_information  1   ; Retain non-status information across program restarts
> >     normal_check_interval         5
> >     retry_check_interval          2
> >     notification_period           24x7
> >     notification_options          u,c,r
> >     register                      0   ; DONT REGISTER THIS DEFINITION
> > }
> >
> > define service{
> >     use                           generic-service
> >     host_name                     myrouter
> >     service_description           CPU
> >     is_volatile                   0
> >     check_period                  24x7
> >     max_check_attempts            3
> >     retry_check_interval          1
> >     contact_groups                router-admins
> >     notification_interval         120
> >     notification_period           24x7
> >     check_command                 check_cisco_cpu!publicro!60!69!70!80
> > }
> >
> >
> >
> > #--- checkcommands.cfg
> > # 'check_snmp' generic command definition
> > define command{
> >     command_name check_snmp
> >     command_line $USER1$/check_snmp -t 10 -H $HOSTADDRESS$ -C $ARG1$ $ARG2$ $ARG3$ $ARG4$ $ARG5$ $ARG6$ $ARG7$ $ARG8$ $ARG9$
> > }
> > # check_cisco_cpu: checks router CPU-usage
> > # Syntax !Hostname!Community!WARN-1min-%!WARN-5min-%!CRIT-1min-%!CRIT-5min-%
> > define command{
> >     command_name check_cisco_cpu
> >     command_line $USER1$/check_snmp -t 10 -H $HOSTADDRESS$ -o.1.3.6.1.4.1.9.2.1.57.0,.1.3.6.1.4.1.9.2.1.58.0 -C $ARG1$ -w :$ARG2$,:$ARG3$ -c : $ARG4$,:$ARG5$ -l 'CPU usage 1min/5min' -D ' / '
> > }
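To see what command line the check_cisco_cpu definition produces for the service's check_command, the macro substitution can be imitated in a few lines. This is a rough sketch of the "!"-separated $ARGn$ substitution only, not Nagios's own macro code; the function name expand_command and the $USER1$ path /usr/local/nagios/libexec are assumptions for illustration.

```python
def expand_command(check_command, command_line, host_address, user1):
    """Imitate $USER1$, $HOSTADDRESS$ and $ARGn$ substitution (hypothetical helper)."""
    # The first "!"-separated field is the command name, the rest are $ARG1$..$ARGn$.
    _name, *args = check_command.split("!")
    expanded = command_line.replace("$USER1$", user1)
    expanded = expanded.replace("$HOSTADDRESS$", host_address)
    for index, arg in enumerate(args, start=1):
        expanded = expanded.replace("$ARG%d$" % index, arg)
    return expanded


# command_line copied from the check_cisco_cpu definition above.
COMMAND_LINE = (
    "$USER1$/check_snmp -t 10 -H $HOSTADDRESS$ "
    "-o.1.3.6.1.4.1.9.2.1.57.0,.1.3.6.1.4.1.9.2.1.58.0 -C $ARG1$ "
    "-w :$ARG2$,:$ARG3$ -c : $ARG4$,:$ARG5$ -l 'CPU usage 1min/5min' -D ' / '"
)

if __name__ == "__main__":
    print(expand_command(
        "check_cisco_cpu!publicro!60!69!70!80",
        COMMAND_LINE,
        host_address="192.168.1.1",
        user1="/usr/local/nagios/libexec",  # assumed plugin directory
    ))
```

In the expanded line the thresholds come out as -w :60,:69 -c : 70,:80. If the space after "-c :" is really in the file and not an artifact of mail wrapping, the shell would pass ":" and "70,:80" to the plugin as separate words.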
> >
> >
> >
> > Btw, looking at the code in check_snmp.c, I'm wondering: is there a
> > problem with
> >     #define mark(a) ((a)!=0?"*":"")
> > in check_snmp.c? Or are my parameters that bad? :-o
> >
> > Thanks for your kind help.
> > Warm regards,
> > Pascal
> >
> >
> >
> > _______________________________________________
> > Nagios-users mailing list
> > Nagios-users at lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/nagios-users
> >
>
>