[Nagiosplug-devel] Working on testcases
Ton Voon
ton.voon at altinity.com
Mon Nov 14 02:33:19 CET 2005
On 13 Nov 2005, at 19:57, sean finney wrote:
> just to throw another $0.02 into the bucket...
Please! I value all your opinions!
> On Fri, Nov 11, 2005 at 12:51:48AM +0000, Ton Voon wrote:
>> "UNKNOWN is for invalid command args or any other failure before the
>> requested check can be performed - with the only exception being
>> hostname lookups which should return CRITICAL."
>
> given the example you listed below, i don't think this is a good idea.
> rather, i think something like:
>
> "UNKNOWN is for invalid command args or other failures preventing
> the plugin from performing the specified operation."
Yes, I think this is better. I keep thinking about "things in the
middle" that affect results. I think my facetious examples with
check_http muddied the waters a bit.
Just to make it clearer, how about amending to:
"UNKNOWN is for invalid command args, or failures in other systems
preventing the plugin from performing the specified operation"
So, other systems that could prevent the check include: internal
errors (unix level: malloc, fork, etc), networks, DNS and agents.
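To make the amended wording concrete, here is a minimal sketch of how it would map failure causes to plugin states. The numeric values are the standard plugin return codes; the Failure categories and the state_for() helper are hypothetical names made up for illustration and are not taken from any plugin source.

```python
from enum import Enum, IntEnum


class State(IntEnum):
    """Standard plugin return codes."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


class Failure(Enum):
    """Hypothetical failure categories, named after the list above."""
    INVALID_ARGS = "invalid command args"
    INTERNAL = "internal error (malloc, fork, etc)"
    NETWORK = "network"
    DNS = "DNS"
    AGENT = "agent"
    TARGET = "the thing being checked"


# Under the proposed wording, these prevent the check from being performed.
UNKNOWN_CAUSES = {
    Failure.INVALID_ARGS,
    Failure.INTERNAL,
    Failure.NETWORK,
    Failure.DNS,
    Failure.AGENT,
}


def state_for(failure):
    """Map a failure cause (or None for no failure) to a plugin state."""
    if failure is None:
        return State.OK
    if failure in UNKNOWN_CAUSES:
        return State.UNKNOWN
    # Anything else is a problem with the checked service itself.
    return State.CRITICAL
```

Thresholds and WARNING are left out on purpose; the sketch only covers the UNKNOWN versus CRITICAL split being discussed here.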
> about dns: i think there are two specific and very different kinds
> of failure. there is "general resolution failure", and there is a
> "host does not exist failure". i would say that the former ought
> to remain as an UNKNOWN, as it parallels similar failures in other
> system calls such as malloc. however, if the plugin gets a "no such
> host" response, then it definitely should be CRITICAL--as you could
> implicitly divine that the hostname is supposed to resolve.
> similarly, i feel that remote service check connection failures
> should remain CRITICAL.
Given the above definition, both failures should be UNKNOWN. I'm with
Andreas on this. But there's Sean and Ethan on CRITICAL. So the
voting currently stands at 2-2.
If we go with CRITICAL, then this needs to be marked as an exception
in the guidelines.
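The two positions on DNS can be sketched side by side. This is only an illustration: state_for_dns_error() and its no_such_host_is_critical flag are hypothetical, and it assumes the resolver reports "no such host" as EAI_NONAME, which may not hold on every platform.

```python
import socket

# Standard plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def state_for_dns_error(err, no_such_host_is_critical=True):
    """Map a socket.gaierror to a plugin state.

    With the flag set, "no such host" is CRITICAL and any other resolution
    failure is UNKNOWN. With the flag cleared, every resolution failure is
    UNKNOWN, as the amended definition implies.
    """
    if no_such_host_is_critical and err.errno == socket.EAI_NONAME:
        return CRITICAL
    return UNKNOWN
```

A caller would wrap socket.getaddrinfo() in a try/except for socket.gaierror and pass the caught exception in. If the CRITICAL behaviour wins the vote, the flag's default is the exception that would need documenting in the guidelines.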
Andreas also says:
> check_nt, check_nwstat, check_nrpe and check_snmp should return
> UNKNOWN if it can't get the data it's requesting (as it's not the
> status of the agent that's being requested)
which also fits with this definition.
>> (2) check_http -H webserver -w 2
>>
>> This returns OK if can connect to webserver and returns data within 2
>> seconds. If it cannot connect, then this returns UNKNOWN because it
>> is not the metric that is being requested to check against (currently
>> returns CRITICAL).
>
> i'd say it should still return CRITICAL.
>
Yes, I'm clearly wrong. By the definition, it is not a failure "in
[an]other system", so UNKNOWN is the wrong state, so it must be
CRITICAL.
>> (3) check_http -H webserver -r 'string_to_find'
>>
>> This returns OK if it can find the server and return data with the
>> string. If it cannot connect to the server (currently CRITICAL), or
>> gets a 302 redirection (currently OK (?) ), this should be an
>> UNKNOWN.
>
> again, i think things such as "connection refused" should still result
> in states indicative of a problem.
Connection failure is CRITICAL. For the 302 redirection, by the
requested arguments the string_to_find is not found, so OK is wrong.
But it is not a failure in another system, so UNKNOWN is wrong too.
So it must be CRITICAL (or, I guess, configurable to WARNING).
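As a rough sketch of that reasoning for example (3): check_body() and its mismatch_state parameter are hypothetical and this is not how check_http is implemented. A body of None stands in for "could not connect".

```python
import re

# Standard plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_body(body, pattern, mismatch_state=CRITICAL):
    """Decide the state for a response body and a regex to find in it.

    body is None when the connection failed. mismatch_state is the state
    returned when the pattern is absent (CRITICAL, or WARNING if configured).
    """
    if body is None:
        # Connection failure: a problem with the checked service itself.
        return CRITICAL
    if re.search(pattern, body):
        return OK
    # Data came back but the string is not in it, e.g. a redirect page.
    return mismatch_state
```

Nothing here returns UNKNOWN, because neither case is a failure in another system.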
> the big difference in my
> view is that some problems prevent the plugin from doing its job,
> while other problems signify that there really is a problem.
I think that difference is a lot clearer in your mind than it is in
mine :) I think the "other systems" wording helps me, which Sean's
intuitively picked up on.
> wrt the 302 redirections, i haven't even looked at what we're
> currently doing but feel we ought to follow the redirection (or
> provide a cmdline toggle) if we want to be good user-agents :)
The default is not to follow. "-f follow" in the case above will work.
Ton
http://www.altinity.com
T: +44 (0)870 787 9243
F: +44 (0)845 280 1725
Skype: tonvoon