[Nagiosplug-checkins] CVS: nagiosplug/doc README,NONE,1.1 developer-guidelines.html,NONE,1.1 developer-guidelines.sgml,NONE,1.1
Subhendu Ghosh
sghosh at users.sourceforge.net
Sun May 26 19:06:02 CEST 2002
Update of /cvsroot/nagiosplug/nagiosplug/doc
In directory usw-pr-cvs1:/tmp/cvs-serv21569/doc
Added Files:
README developer-guidelines.html developer-guidelines.sgml
Log Message:
added developer guidelines.
--- NEW FILE ---
The developer documentation here is generated from the DocBook format.
--- NEW FILE ---
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<HTML
><HEAD
><TITLE
>Nagios plug-in development guidelines</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.64
"></HEAD
><BODY
CLASS="BOOK"
BGCOLOR="#FFFFFF"
TEXT="#000000"
LINK="#0000FF"
VLINK="#840084"
ALINK="#0000FF"
><DIV
CLASS="BOOK"
><A
NAME="AEN1"
></A
><DIV
CLASS="TITLEPAGE"
><H1
CLASS="TITLE"
><A
NAME="AEN3"
>Nagios plug-in development guidelines</A
></H1
><H3
CLASS="AUTHOR"
><A
NAME="AEN5"
>Karl DeBisschop</A
></H3
><DIV
CLASS="AFFILIATION"
><DIV
CLASS="ADDRESS"
><P
CLASS="ADDRESS"
>karl at debisschop.net</P
></DIV
></DIV
><H3
CLASS="AUTHOR"
><A
NAME="AEN11"
>Ethan Galstad</A
></H3
><DIV
CLASS="AFFILIATION"
><DIV
CLASS="ADDRESS"
><P
CLASS="ADDRESS"
>netsaint at linuxbox.com</P
></DIV
></DIV
><H3
CLASS="AUTHOR"
><A
NAME="AEN21"
>Hugo Gayosso</A
></H3
><DIV
CLASS="AFFILIATION"
><DIV
CLASS="ADDRESS"
><P
CLASS="ADDRESS"
>hgayosso at gnu.org</P
></DIV
></DIV
><H3
CLASS="AUTHOR"
><A
NAME="AEN27"
>Subhendu Ghosh</A
></H3
><DIV
CLASS="AFFILIATION"
><DIV
CLASS="ADDRESS"
><P
CLASS="ADDRESS"
>sghosh at sourceforge.net</P
></DIV
></DIV
><H3
CLASS="AUTHOR"
><A
NAME="AEN33"
>Stanley Hopcroft</A
></H3
><DIV
CLASS="AFFILIATION"
><DIV
CLASS="ADDRESS"
><P
CLASS="ADDRESS"
>stanleyhopcroft at sourceforge.net</P
></DIV
></DIV
><P
CLASS="COPYRIGHT"
>Copyright © 2000 2001 2002 by Karl DeBisschop, Ethan Galstad,
Hugo Gayosso, Stanley Hopcroft, Subhendu Ghosh</P
><HR></DIV
><DIV
CLASS="TOC"
><DL
><DT
><B
>Table of Contents</B
></DT
><DT
><A
HREF="#PREFACE"
>About the guidelines</A
></DT
><DD
><DL
><DT
><A
HREF="#AEN51"
>Copyright</A
></DT
></DL
></DD
><DT
><A
HREF="#AEN56"
></A
></DT
><DD
><DL
><DT
><A
HREF="#PLUGOUTPUT"
>Plugin Output for Nagios</A
></DT
><DD
><DL
><DT
><A
HREF="#AEN60"
>Print only one line of text</A
></DT
><DT
><A
HREF="#AEN63"
>Screen Output</A
></DT
><DT
><A
HREF="#AEN67"
>Return the proper status code</A
></DT
><DT
><A
HREF="#AEN71"
>Plugin Return Codes</A
></DT
></DL
></DD
><DT
><A
HREF="#SYSCMDAUXFILES"
>System Commands and Auxiliary Files</A
></DT
><DD
><DL
><DT
><A
HREF="#AEN117"
>Don't execute system commands without specifying their
full path</A
></DT
><DT
><A
HREF="#AEN121"
>Use spopen() if external commands must be executed</A
></DT
><DT
><A
HREF="#AEN125"
>Don't make temp files unless absolutely required</A
></DT
><DT
><A
HREF="#AEN128"
>Don't be tricked into following symlinks</A
></DT
><DT
><A
HREF="#AEN131"
>Validate all input</A
></DT
></DL
></DD
><DT
><A
HREF="#PERLPLUGIN"
>Perl Plugins</A
></DT
><DT
><A
HREF="#RUNTIME"
>Runtime Timeouts</A
></DT
><DD
><DL
><DT
><A
HREF="#AEN165"
>Use DEFAULT_SOCKET_TIMEOUT</A
></DT
><DT
><A
HREF="#AEN168"
>Add alarms to network plugins</A
></DT
></DL
></DD
><DT
><A
HREF="#PLUGOPTIONS"
>Plugin Options</A
></DT
><DD
><DL
><DT
><A
HREF="#AEN174"
>Option Processing</A
></DT
><DT
><A
HREF="#AEN187"
>Plugins with more than one type of threshold, or with
threshold ranges</A
></DT
></DL
></DD
><DT
><A
HREF="#SUBMITTINGCHANGES"
>New submissions and patches</A
></DT
></DL
></DD
></DL
></DIV
><DIV
CLASS="PREFACE"
><HR><H1
><A
NAME="PREFACE"
>About the guidelines</A
></H1
><P
>The purpose of these guidelines is to provide a reference for
plug-in developers and to encourage the standardization of the
different kinds of plug-ins: C, shell, Perl, Python, etc.</P
><DIV
CLASS="SECTION"
><HR><H1
CLASS="SECTION"
><A
NAME="AEN51"
>Copyright</A
></H1
><P
>Nagios Plug-in Development Guidelines Copyright (C) 2000 2001
2002
Karl DeBisschop, Ethan Galstad, Hugo Gayosso, Stanley Hopcroft,
Subhendu Ghosh</P
><P
>Permission is granted to make and distribute verbatim
copies of this manual provided the copyright notice and this
permission notice are preserved on all copies.</P
><P
>The plugins themselves are copyrighted by their respective
authors.</P
></DIV
></DIV
><DIV
CLASS="ARTICLE"
><DIV
CLASS="TOC"
><DL
><DT
><B
>Table of Contents</B
></DT
><DT
><A
HREF="#PLUGOUTPUT"
>Plugin Output for Nagios</A
></DT
><DT
><A
HREF="#SYSCMDAUXFILES"
>System Commands and Auxiliary Files</A
></DT
><DT
><A
HREF="#PERLPLUGIN"
>Perl Plugins</A
></DT
><DT
><A
HREF="#RUNTIME"
>Runtime Timeouts</A
></DT
><DT
><A
HREF="#PLUGOPTIONS"
>Plugin Options</A
></DT
><DT
><A
HREF="#SUBMITTINGCHANGES"
>New submissions and patches</A
></DT
></DL
></DIV
><DIV
CLASS="SECTION"
><H1
CLASS="SECTION"
><A
NAME="PLUGOUTPUT"
>Plugin Output for Nagios</A
></H1
><P
>You should always print something to STDOUT that tells if the
service is working or why it's failing. Try to keep the output short -
probably less than 80 characters. Remember that you ideally would like
the entire output to appear in a pager message, which will get chopped
off after a certain length.</P
><DIV
CLASS="SECTION"
><HR><H2
CLASS="SECTION"
><A
NAME="AEN60"
>Print only one line of text</A
></H2
><P
>Nagios will only grab the first line of text from STDOUT
when it notifies contacts about potential problems. If you print
multiple lines, you're out of luck. Remember, keep it short and
to the point.</P
></DIV
><DIV
CLASS="SECTION"
><HR><H2
CLASS="SECTION"
><A
NAME="AEN63"
>Screen Output</A
></H2
><P
>The plug-in should print the diagnostic and just the
synopsis part of the help message. A well written plugin would
then have --help as a way to get the verbose help.</P
><P
>Code and output should try to respect the 80x25 size of a
crt (remember when fixing stuff in the server room!)</P
></DIV
><DIV
CLASS="SECTION"
><HR><H2
CLASS="SECTION"
><A
NAME="AEN67"
>Return the proper status code</A
></H2
><P
>See <A
HREF="#RETURNCODES"
>Table 1 in the section called <I
>Plugin Return Codes</I
></A
> below
for the numeric values of status codes and their
descriptions. Remember to return an UNKNOWN state if bogus or
invalid command line arguments are supplied or if you are unable
to check the service.</P
></DIV
><DIV
CLASS="SECTION"
><HR><H2
CLASS="SECTION"
><A
NAME="AEN71"
>Plugin Return Codes</A
></H2
><P
>The return codes below are based on the POSIX spec of returning
a positive value. Netsaint prior to v0.0.7 supported a non-POSIX-compliant
return code of "-1" for unknown. Nagios supports POSIX return
codes by default.</P
><P
>Note: Some plugins will on occasion print on STDOUT that an error
occurred and that the error code is 138 or 255 or some such number. These
are usually caused by plugins that use system commands without
enough checks to catch unexpected output. Developers should include a
default catch-all for system command output that returns an UNKNOWN
return code.</P
><DIV
CLASS="TABLE"
><A
NAME="RETURNCODES"
></A
><P
><B
>Table 1. Plugin Return Codes</B
></P
><TABLE
BORDER="1"
BGCOLOR="#E0E0E0"
CELLSPACING="0"
CELLPADDING="4"
CLASS="CALSTABLE"
><THEAD
><TR
><TH
ALIGN="LEFT"
VALIGN="TOP"
><P
>Numeric Value</P
></TH
><TH
ALIGN="LEFT"
VALIGN="TOP"
><P
>Service Status</P
></TH
><TH
ALIGN="LEFT"
VALIGN="TOP"
><P
>Status Description</P
></TH
></TR
></THEAD
><TBODY
><TR
><TD
ALIGN="CENTER"
VALIGN="TOP"
><P
>0</P
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
><P
>OK</P
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
><P
>The plugin was able to check the service and it
appeared to be functioning properly</P
></TD
></TR
><TR
><TD
ALIGN="CENTER"
VALIGN="TOP"
><P
>1</P
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
><P
>Warning</P
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
><P
>The plugin was able to check the service, but it
appeared to be above some "warning" threshold or did not appear
to be working properly</P
></TD
></TR
><TR
><TD
ALIGN="CENTER"
VALIGN="TOP"
><P
>2</P
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
><P
>Critical</P
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
><P
>The plugin detected that either the service was not
running or it was above some "critical" threshold</P
></TD
></TR
><TR
><TD
ALIGN="CENTER"
VALIGN="TOP"
><P
>3</P
></TD
><TD
ALIGN="LEFT"
VALIGN="MIDDLE"
><P
>Unknown</P
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
><P
>Invalid command line arguments were supplied to the
plugin or the plugin was unable to check the status of the given
hosts/service</P
></TD
></TR
></TBODY
></TABLE
></DIV
></DIV
></DIV
><DIV
CLASS="SECTION"
><HR><H1
CLASS="SECTION"
><A
NAME="SYSCMDAUXFILES"
>System Commands and Auxiliary Files</A
></H1
><DIV
CLASS="SECTION"
><H2
CLASS="SECTION"
><A
NAME="AEN117"
>Don't execute system commands without specifying their
full path</A
></H2
><P
>Don't use exec(), popen(), etc. to execute external
commands without explicitly using the full path of the external
program.</P
><P
>Doing otherwise makes the plugin vulnerable to hijacking
by a trojan horse earlier in the search path. See the main
plugin distribution for examples on how this is done.</P
></DIV
><DIV
CLASS="SECTION"
><HR><H2
CLASS="SECTION"
><A
NAME="AEN121"
>Use spopen() if external commands must be executed</A
></H2
><P
>If you have to execute external commands from within your
plugin and you're writing it in C, use the spopen() function
that Karl DeBisschop has written.</P
><P
>The code for spopen() and spclose() is included with the
core plugin distribution.</P
></DIV
><DIV
CLASS="SECTION"
><HR><H2
CLASS="SECTION"
><A
NAME="AEN125"
>Don't make temp files unless absolutely required</A
></H2
><P
>If temp files are needed, make sure that the plugin will
fail cleanly if the file can't be written (e.g., too few file
handles, out of disk space, incorrect permissions, etc.) and
delete the temp file when processing is complete.</P
></DIV
><DIV
CLASS="SECTION"
><HR><H2
CLASS="SECTION"
><A
NAME="AEN128"
>Don't be tricked into following symlinks</A
></H2
><P
>If your plugin opens any files, take steps to ensure that
you are not following a symlink to another location on the
system.</P
></DIV
><DIV
CLASS="SECTION"
><HR><H2
CLASS="SECTION"
><A
NAME="AEN131"
>Validate all input</A
></H2
><P
>Use the routines in utils.c or utils.pm, and write more as needed.</P
></DIV
></DIV
><DIV
CLASS="SECTION"
><HR><H1
CLASS="SECTION"
><A
NAME="PERLPLUGIN"
>Perl Plugins</A
></H1
><P
>Perl plugins are coded a little more defensively than other
plugins because of embedded Perl. When configured as such, embedded
Perl Nagios (ePN) requires stricter use of some of Perl's features.
This section outlines some of the steps needed to use ePN
effectively.</P
><P
></P
><OL
TYPE="1"
><LI
><P
>Do not use BEGIN and END blocks since, with embedded Perl (ePN),
they will be called only the first time the plugin runs and when Nagios
shuts down. In particular, do not use BEGIN blocks to initialize variables.</P
></LI
><LI
><P
>To use utils.pm, you need to provide a full path to the
module in order for it to work with ePN.</P
><P
CLASS="LITERALLAYOUT"
> e.g.<br>
use lib "/usr/local/nagios/libexec";<br>
use utils qw(...);<br>
</P
></LI
><LI
><P
>Perl scripts should be called with "-w"</P
></LI
><LI
><P
>All Perl plugins must compile cleanly under "use strict" - i.e. at
least use explicit package names, as in "$main::x", or predeclare every
variable.</P
><P
>Explicitly initialize each variable in use. Otherwise, with
caching enabled, the plugin will not be recompiled each time, and
therefore Perl will not reinitialize all the variables. All old
variable values will still be in effect.</P
></LI
><LI
><P
>Do not use < DATA > (these simply do not compile under ePN).</P
></LI
><LI
><P
>Do not use named subroutines</P
></LI
><LI
><P
>If writing to a file (perhaps recording
performance data), explicitly close it. The plugin never
calls <I
CLASS="EMPHASIS"
>exit</I
>; that is caught by
p1.pl, so output streams are never closed.</P
></LI
><LI
><P
>As in <A
HREF="#RUNTIME"
>the section called <I
>Runtime Timeouts</I
></A
> all plugins need
to monitor their runtime, especially if they are using network
resources. Use of the <I
CLASS="EMPHASIS"
>alarm</I
> is recommended.
Plugins may import a default time out ($TIMEOUT) from utils.pm.
</P
></LI
><LI
><P
>Perl plugins should import %ERRORS from utils.pm
and then "exit $ERRORS{'OK'}" rather than "exit 0"
</P
></LI
></OL
></DIV
><DIV
CLASS="SECTION"
><HR><H1
CLASS="SECTION"
><A
NAME="RUNTIME"
>Runtime Timeouts</A
></H1
><P
>Plugins have a very limited runtime - typically 10 sec.
As a result, it is very important for plugins to maintain internal
code to exit if runtime exceeds a threshold. </P
><P
>All plugins should time out gracefully, not just networking
plugins. For instance, df may lock if you have automounted
drives and your network fails - but at first glance, who'd think
df could lock up like that? Besides, a plugin that can time out is
more error resistant than one that hangs and consumes
resources.</P
><DIV
CLASS="SECTION"
><HR><H2
CLASS="SECTION"
><A
NAME="AEN165"
>Use DEFAULT_SOCKET_TIMEOUT</A
></H2
><P
>All network plugins should use DEFAULT_SOCKET_TIMEOUT as their timeout.</P
></DIV
><DIV
CLASS="SECTION"
><HR><H2
CLASS="SECTION"
><A
NAME="AEN168"
>Add alarms to network plugins</A
></H2
><P
>If you write a plugin which communicates with another
networked host, you should make sure to set an alarm() in your
code that prevents the plugin from hanging due to abnormal
socket closures, etc. Nagios takes steps to protect itself
against unruly plugins that time out, but any plugins you create
should be well behaved on their own.</P
></DIV
></DIV
><DIV
CLASS="SECTION"
><HR><H1
CLASS="SECTION"
><A
NAME="PLUGOPTIONS"
>Plugin Options</A
></H1
><P
>A well written plugin should have --help as a way to get
verbose help. Code and output should try to respect the 80x25 size of a
crt (remember when fixing stuff in the server room!)</P
><DIV
CLASS="SECTION"
><HR><H2
CLASS="SECTION"
><A
NAME="AEN174"
>Option Processing</A
></H2
><P
>For plugins written in C, we recommend the C standard
getopt library for short options. If using getopt_long, check to
be sure that HAVE_GETOPT_H is defined (configure checks this and
sets the #define in common/config.h).</P
><P
>For plugins written in Perl, we recommend the Getopt::Long module.</P
><P
>Positional arguments are strongly discouraged.</P
><P
>There are a few reserved options that should not be used
for other purposes:</P
><P
CLASS="LITERALLAYOUT"
> -V version (--version)<br>
-h help (--help)<br>
-t timeout (--timeout)<br>
-w warning threshold (--warning)<br>
-c critical threshold (--critical)<br>
-H hostname (--hostname)<br>
</P
><P
>In addition to the reserved options above, some other standard options are:</P
><P
CLASS="LITERALLAYOUT"
> -C SNMP community (--community)<br>
-a authentication password (--authentication)<br>
-l login name (--logname)<br>
-p port or password (--port or --passwd/--password)<br>
-u url or username (--url or --username)<br>
</P
><P
>Look at check_pgsql and check_procs to see how I currently
think this can work. The standard options should behave as follows:</P
><P
>The option -V or --version should be present in all
plugins. For C plugins it should result in a call to print_revision, a
function in utils.c which takes two character arguments, the
command name and the plugin revision.</P
><P
>The -? option, or any other unparsable set of options,
should print out a short usage statement. Lines should be
80 characters wide or less, and no more than 23 lines should be printed (it
should display cleanly on a dumb terminal in a server
room).</P
><P
>The option -h or --help should be present in all plugins.
In C plugins, it should result in a call to print_help (or
equivalent). The function print_help should call print_revision,
then print_usage, then should provide detailed
help. Help text should fit on an 80-character width display, but
may run as many lines as needed.</P
></DIV
><DIV
CLASS="SECTION"
><HR><H2
CLASS="SECTION"
><A
NAME="AEN187"
>Plugins with more than one type of threshold, or with
threshold ranges</A
></H2
><P
>Old style was to do things like -ct for critical time and
-cv for critical value. That goes out the window with POSIX
getopt. The allowable alternatives are:</P
><P
></P
><OL
TYPE="1"
><LI
><P
>long options like -critical-time (or -ct and -cv, I
suppose).</P
></LI
><LI
><P
>repeated options like `check_load -w 10 -w 6 -w 4 -c
16 -c 10 -c 10`</P
></LI
><LI
><P
>for brevity, the above can be expressed as `check_load
-w 10,6,4 -c 16,10,10`</P
></LI
><LI
><P
>ranges are expressed with colons as in `check_procs -C
httpd -w 1:20 -c 1:30`, which will warn above 20 instances
and go critical at 0 or above 30 instances</P
></LI
><LI
><P
>lists are expressed with commas, so Jacob's check_nmap
uses constructs like '-p 1000,1010,1050:1060,2000'</P
></LI
><LI
><P
>If possible when writing lists, use tokens to make the
list easy to remember and independent of order. For example,
check_disk uses '-c 10000,10%' so that it is clear which is
the percentage and which is the KB value (note that due to
my own lack of foresight, that used to be '-c 10000:10%', but
such constructs should all be changed for consistency,
though providing backward compatibility is fairly
easy).</P
></LI
></OL
><P
>As always, comments are welcome - making this consistent
without a host of long options was quite a hassle, and I would
suspect that there are flaws in this strategy. Perhaps clear
long options are the most important of the above choices, but not
all POSIX systems have C libraries for long options, so the
short forms must exist as well.</P
></DIV
></DIV
><DIV
CLASS="SECTION"
><HR><H1
CLASS="SECTION"
><A
NAME="SUBMITTINGCHANGES"
>New submissions and patches</A
></H1
><P
>If you would like others to use your plugins and have them included in
the standard distribution, please include patches for the relevant
configuration files, in particular "configure.in". Otherwise, submitted
plugins will be included in the contrib directory.</P
><P
>Plugins in the contrib directory are going to be migrated to the
standard plugins/plugin-scripts directory as time permits and per user
requests.</P
><P
>Patches should be submitted via SourceForge and announced on
the mailing list.</P
><P
>For new plugins, provide a diff to add to the EXTRAS list (configure.in)
unless you are fairly sure that the plugin will work for all platforms with
no non-standard software added.</P
><P
>If possible, please submit a test harness. Documentation on sample
tests is coming soon.</P
></DIV
></DIV
></DIV
></BODY
></HTML
>
--- NEW FILE ---
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.1//EN">
<book>
<title>Nagios Plug-in Developer Guidelines</title>
<bookinfo>
<authorgroup>
<author>
<firstname>Karl</firstname>
<surname>DeBisschop</surname>
<affiliation>
<address><email>karl at debisschop.net</email></address>
</affiliation>
</author>
<author>
<firstname>Ethan</firstname>
<surname>Galstad</surname>
<authorblurb>
<para>Author of Nagios</para>
<para><ulink url="http://www.nagios.org"></ulink></para>
</authorblurb>
<affiliation>
<address><email>netsaint at linuxbox.com</email></address>
</affiliation>
</author>
<author>
<firstname>Hugo</firstname>
<surname>Gayosso</surname>
<affiliation>
<address><email>hgayosso at gnu.org</email></address>
</affiliation>
</author>
<author>
<firstname>Subhendu</firstname>
<surname>Ghosh</surname>
<affiliation>
<address><email>sghosh at sourceforge.net</email></address>
</affiliation>
</author>
<author>
<firstname>Stanley</firstname>
<surname>Hopcroft</surname>
<affiliation>
<address><email>stanleyhopcroft at sourceforge.net</email></address>
</affiliation>
</author>
</authorgroup>
<pubdate>2002</pubdate>
<title>Nagios plug-in development guidelines</title>
<revhistory>
<revision>
<revnumber>0.4</revnumber>
<date>2 May 2002</date>
</revision>
</revhistory>
<copyright>
<year>2000 2001 2002</year>
<holder>Karl DeBisschop, Ethan Galstad,
Hugo Gayosso, Stanley Hopcroft, Subhendu Ghosh</holder>
</copyright>
</bookinfo>
<preface id=preface>
<title>About the guidelines</title>
<para>The purpose of these guidelines is to provide a reference for
plug-in developers and to encourage the standardization of the
different kinds of plug-ins: C, shell, Perl, Python, etc.</para>
<section> <title>Copyright</title>
<para>Nagios Plug-in Development Guidelines Copyright (C) 2000 2001
2002
Karl DeBisschop, Ethan Galstad, Hugo Gayosso, Stanley Hopcroft,
Subhendu Ghosh</para>
<para>Permission is granted to make and distribute verbatim
copies of this manual provided the copyright notice and this
permission notice are preserved on all copies.</para>
<para>The plugins themselves are copyrighted by their respective
authors.</para>
</section>
</preface>
<article>
<section id="PlugOutput"><title>Plugin Output for Nagios</title>
<para>You should always print something to STDOUT that tells if the
service is working or why it's failing. Try to keep the output short -
probably less than 80 characters. Remember that you ideally would like
the entire output to appear in a pager message, which will get chopped
off after a certain length.</para>
<section><title>Print only one line of text</title>
<para>Nagios will only grab the first line of text from STDOUT
when it notifies contacts about potential problems. If you print
multiple lines, you're out of luck. Remember, keep it short and
to the point.</para>
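<para>As a rough illustration of the one-line rule, the Python sketch
below collapses a message onto a single line and trims it to 80 characters.
The helper name format_status_line and the "SERVICE STATE - detail" layout
are assumptions made for this example; they are not defined by Nagios or by
the plugin distribution.</para>
<programlisting>
```python
def format_status_line(service, state, detail, max_len=80):
    """Build a single-line status message no longer than max_len.

    Hypothetical helper: the "SERVICE STATE - detail" layout is only an
    illustrative convention, not something Nagios mandates.
    """
    # Collapse newlines and runs of whitespace so exactly one line is printed.
    text = " ".join(f"{service} {state} - {detail}".split())
    if len(text) > max_len:
        # Truncate and mark the cut so the reader knows text is missing.
        text = text[: max_len - 3] + "..."
    return text


print(format_status_line("DISK", "WARNING", "/var is 91% full"))
```
</programlisting>
<para>Trimming inside the plugin keeps the most useful text at the front
of the line, rather than leaving the cut to a pager gateway.</para>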
</section>
<section><title>Screen Output</title>
<para>The plug-in should print the diagnostic and just the
synopsis part of the help message. A well written plugin would
then have --help as a way to get the verbose help.</para>
<para>Code and output should try to respect the 80x25 size of a
crt (remember when fixing stuff in the server room!)</para>
</section>
<section><title>Return the proper status code</title>
<para>See <xref linkend="ReturnCodes"> below
for the numeric values of status codes and their
descriptions. Remember to return an UNKNOWN state if bogus or
invalid command line arguments are supplied or if you are unable
to check the service.</para>
</section>
<section><title>Plugin Return Codes</title>
<para>The return codes below are based on the POSIX spec of returning
a positive value. Netsaint prior to v0.0.7 supported a non-POSIX-compliant
return code of "-1" for unknown. Nagios supports POSIX return
codes by default.</para>
<para>Note: Some plugins will on occasion print on STDOUT that an error
occurred and that the error code is 138 or 255 or some such number. These
are usually caused by plugins that use system commands without
enough checks to catch unexpected output. Developers should include a
default catch-all for system command output that returns an UNKNOWN
return code.</para>
<table id="ReturnCodes"><title>Plugin Return Codes</title>
<tgroup cols="3">
<thead>
<row>
<entry><para>Numeric Value</para></entry>
<entry><para>Service Status</para></entry>
<entry><para>Status Description</para></entry>
</row>
</thead>
<tbody>
<row>
<entry align=center><para>0</para></entry>
<entry valign=middle><para>OK</para></entry>
<entry><para>The plugin was able to check the service and it
appeared to be functioning properly</para></entry>
</row>
<row>
<entry align=center><para>1</para></entry>
<entry valign=middle><para>Warning</para></entry>
<entry><para>The plugin was able to check the service, but it
appeared to be above some "warning" threshold or did not appear
to be working properly</para></entry>
</row>
<row>
<entry align=center><para>2</para></entry>
<entry valign=middle><para>Critical</para></entry>
<entry><para>The plugin detected that either the service was not
running or it was above some "critical" threshold</para></entry>
</row>
<row>
<entry align=center><para>3</para></entry>
<entry valign=middle><para>Unknown</para></entry>
<entry><para>Invalid command line arguments were supplied to the
plugin or the plugin was unable to check the status of the given
hosts/service</para></entry>
</row>
</tbody>
</tgroup>
</table>
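<para>The table above, together with the catch-all advice, might be
sketched in Python as follows. The constant names, the classify and main
helpers, and the 80/90 thresholds are all hypothetical and exist only for
this example.</para>
<programlisting>
```python
# Numeric values taken from the return-code table above.
STATE_OK = 0
STATE_WARNING = 1
STATE_CRITICAL = 2
STATE_UNKNOWN = 3

STATE_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def classify(value, warning, critical):
    """Map a measured value onto a plugin return code (hypothetical helper)."""
    if value >= critical:
        return STATE_CRITICAL
    if value >= warning:
        return STATE_WARNING
    return STATE_OK


def main(argv):
    """Return the exit status; a real plugin would pass it to sys.exit()."""
    try:
        value = float(argv[1])
        code = classify(value, warning=80.0, critical=90.0)
    except Exception:
        # Catch-all: anything unexpected becomes UNKNOWN, never a stray code.
        print("UNKNOWN - could not evaluate input")
        return STATE_UNKNOWN
    print(f"{STATE_NAMES[code]} - value is {value}")
    return code
```
</programlisting>
<para>Returning the status from main and exiting in one place makes it
harder for an unhandled error path to leak an unexpected exit code.</para>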
</section>
</section>
<section id="SysCmdAuxFiles"><title>System Commands and Auxiliary Files</title>
<section><title>Don't execute system commands without specifying their
full path</title>
<para>Don't use exec(), popen(), etc. to execute external
commands without explicitly using the full path of the external
program.</para>
<para>Doing otherwise makes the plugin vulnerable to hijacking
by a trojan horse earlier in the search path. See the main
plugin distribution for examples on how this is done.</para>
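<para>For plugins written in Python, the same rule might be enforced with
a small wrapper like the one below. This is only a sketch: run_command is
a hypothetical helper, not part of the plugin distribution, and it is not a
port of the C code referred to above.</para>
<programlisting>
```python
import os
import subprocess


def run_command(argv, timeout=10):
    """Run an external program, refusing anything not given by full path.

    Hypothetical helper for illustration only.
    """
    if not os.path.isabs(argv[0]):
        raise ValueError(f"refusing to run {argv[0]!r}: not an absolute path")
    # An argument list (no shell) with an absolute path means the search
    # path is never consulted to locate the program.
    result = subprocess.run(
        argv,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
    return result.returncode, result.stdout
```
</programlisting>
<para>Rejecting relative names up front turns a silent search-path lookup
into an explicit error that the plugin can report.</para>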
</section>
<section><title>Use spopen() if external commands must be executed</title>
<para>If you have to execute external commands from within your
plugin and you're writing it in C, use the spopen() function
that Karl DeBisschop has written.</para>
<para>The code for spopen() and spclose() is included with the
core plugin distribution.</para>
</section>
<section><title>Don't make temp files unless absolutely required</title>
<para>If temp files are needed, make sure that the plugin will
fail cleanly if the file can't be written (e.g., too few file
handles, out of disk space, incorrect permissions, etc.) and
delete the temp file when processing is complete.</para>
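<para>A minimal Python sketch of this pattern is shown below, assuming a
plugin that needs a scratch file. The function name and the message texts
are made up for the example.</para>
<programlisting>
```python
import os
import tempfile

STATE_OK = 0
STATE_UNKNOWN = 3


def check_with_scratch_file(data, directory=None):
    """Write data to a temp file, read it back, and always clean up.

    Hypothetical helper for illustration only.
    """
    path = None
    try:
        # mkstemp creates a uniquely named file readable only by its owner.
        fd, path = tempfile.mkstemp(prefix="check_demo_", dir=directory)
        with os.fdopen(fd, "w") as handle:
            handle.write(data)
        with open(path) as handle:
            size = len(handle.read())
        return STATE_OK, f"OK - processed {size} characters"
    except OSError as err:
        # Fail cleanly: report UNKNOWN instead of dying with a traceback.
        return STATE_UNKNOWN, f"UNKNOWN - temp file error: {err.strerror}"
    finally:
        # Delete the temp file whether or not processing succeeded.
        if path is not None and os.path.exists(path):
            os.remove(path)
```
</programlisting>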
</section>
<section><title>Don't be tricked into following symlinks</title>
<para>If your plugin opens any files, take steps to ensure that
you are not following a symlink to another location on the
system.</para>
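<para>One possible way to do this in Python on a POSIX system is to open
files with the O_NOFOLLOW flag, as sketched below. The helper name is
hypothetical, and the check covers only the final path component, so it is
a starting point rather than a complete defence.</para>
<programlisting>
```python
import os


def open_without_following_symlinks(path):
    """Open path for reading, refusing if its last component is a symlink.

    Hypothetical helper for illustration only.
    """
    nofollow = getattr(os, "O_NOFOLLOW", 0)
    if not nofollow and os.path.islink(path):
        # Fallback for platforms without O_NOFOLLOW (not race-free).
        raise OSError(f"{path} is a symbolic link")
    # With O_NOFOLLOW set, os.open raises OSError if path is a symlink.
    fd = os.open(path, os.O_RDONLY | nofollow)
    return os.fdopen(fd, "r")
```
</programlisting>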
</section>
<section><title>Validate all input</title>
<para>Use the routines in utils.c or utils.pm, and write more as needed.</para>
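<para>The kind of checks meant here could look like the following Python
sketch. These helpers are hypothetical stand-ins written for this example;
they are not the routines shipped in utils.c or utils.pm, and the host name
pattern is deliberately conservative.</para>
<programlisting>
```python
import re

# Conservative pattern: letters, digits, dots and hyphens only, starting
# and ending with a letter or digit. Shell metacharacters never match.
HOSTNAME_RE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9.-]{0,252}[A-Za-z0-9])?")


def is_valid_hostname(value):
    """True if value looks like a host name or dotted address."""
    return HOSTNAME_RE.fullmatch(value) is not None


def parse_port(value):
    """Return the port as an int, or raise ValueError if it is unusable."""
    port = int(value)  # raises ValueError for non-numeric input
    if port not in range(1, 65536):
        raise ValueError(f"port out of range: {port}")
    return port
```
</programlisting>
<para>A plugin would typically return UNKNOWN when either check fails,
since the arguments supplied were invalid.</para>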
</section>
</section>
<section id="PerlPlugin"><title>Perl Plugins</title>
<para>Perl plugins are coded a little more defensively than other
plugins because of embedded Perl. When configured as such, embedded
Perl Nagios (ePN) requires stricter use of some of Perl's features.
This section outlines some of the steps needed to use ePN
effectively.</para>
<orderedlist>
<listitem><para>Do not use BEGIN and END blocks since, with embedded Perl (ePN),
they will be called only the first time the plugin runs and when Nagios
shuts down. In particular, do not use BEGIN blocks to initialize variables.</para>
</listitem>
<listitem><para>To use utils.pm, you need to provide a full path to the
module in order for it to work with ePN.</para>
<literallayout>
e.g.
use lib "/usr/local/nagios/libexec";
use utils qw(...);
</literallayout>
</listitem>
<listitem><para>Perl scripts should be called with "-w"</para>
</listitem>
<listitem><para>All Perl plugins must compile cleanly under "use strict" - i.e. at
least use explicit package names, as in "$main::x", or predeclare every
variable.</para>
<para>Explicitly initialize each variable in use. Otherwise, with
caching enabled, the plugin will not be recompiled each time, and
therefore Perl will not reinitialize all the variables. All old
variable values will still be in effect.</para>
</listitem>
<listitem><para>Do not use < DATA > (these simply do not compile under ePN).</para>
</listitem>
<listitem><para>Do not use named subroutines</para>
</listitem>
<listitem><para>If writing to a file (perhaps recording
performance data), explicitly close it. The plugin never
calls <emphasis role=strong>exit</emphasis>; that is caught by
p1.pl, so output streams are never closed.</para>
</listitem>
<listitem><para>As in <xref linkend="runtime"> all plugins need
to monitor their runtime, especially if they are using network
resources. Use of the <emphasis>alarm</emphasis> is recommended.
Plugins may import a default time out ($TIMEOUT) from utils.pm.
</para>
</listitem>
<listitem><para>Perl plugins should import %ERRORS from utils.pm
and then "exit $ERRORS{'OK'}" rather than "exit 0"
</para>
</listitem>
</orderedlist>
</section>
<section id="runtime"><title>Runtime Timeouts</title>
<para>Plugins have a very limited runtime - typically 10 sec.
As a result, it is very important for plugins to maintain internal
code to exit if runtime exceeds a threshold. </para>
<para>All plugins should time out gracefully, not just networking
plugins. For instance, df may lock if you have automounted
drives and your network fails - but at first glance, who'd think
df could lock up like that? Besides, a plugin that can time out is
more error resistant than one that hangs and consumes
resources.</para>
<section><title>Use DEFAULT_SOCKET_TIMEOUT</title>
<para>All network plugins should use DEFAULT_SOCKET_TIMEOUT as their timeout.</para>
</section>
<section><title>Add alarms to network plugins</title>
<para>If you write a plugin which communicates with another
networked host, you should make sure to set an alarm() in your
code that prevents the plugin from hanging due to abnormal
socket closures, etc. Nagios takes steps to protect itself
against unruly plugins that time out, but any plugins you create
should be well behaved on their own.</para>
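<para>In a Python plugin on a POSIX system, an alarm could be set roughly
as follows. The names run_with_alarm and PluginTimeout are hypothetical, and
reporting UNKNOWN on timeout is an assumption made here because the service
could not be checked; a plugin may reasonably choose another state.</para>
<programlisting>
```python
import signal

STATE_UNKNOWN = 3
DEFAULT_TIMEOUT = 10  # seconds; mirrors the typical limit mentioned above


class PluginTimeout(Exception):
    """Raised by the SIGALRM handler when the check runs too long."""


def _on_alarm(signum, frame):
    raise PluginTimeout()


def run_with_alarm(check, timeout=DEFAULT_TIMEOUT):
    """Run check() but give up after timeout whole seconds (POSIX only).

    Hypothetical helper; check must return a (status, message) tuple.
    """
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout)
    try:
        return check()
    except PluginTimeout:
        return STATE_UNKNOWN, f"UNKNOWN - timed out after {timeout} seconds"
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)
```
</programlisting>
<para>Signal handlers can only be installed from the main thread in
Python, so this approach suits simple single-threaded plugins.</para>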
</section>
</section>
<section id="PlugOptions"><title>Plugin Options</title>
<para>A well written plugin should have --help as a way to get
verbose help. Code and output should try to respect the 80x25 size of a
crt (remember when fixing stuff in the server room!)</para>
<section><title>Option Processing</title>
<para>For plugins written in C, we recommend the C standard
getopt library for short options. If using getopt_long, check to
be sure that HAVE_GETOPT_H is defined (configure checks this and
sets the #define in common/config.h).</para>
<para>For plugins written in Perl, we recommend the Getopt::Long module.</para>
<para>Positional arguments are strongly discouraged.</para>
<para>There are a few reserved options that should not be used
for other purposes:</para>
<literallayout>
-V version (--version)
-h help (--help)
-t timeout (--timeout)
-w warning threshold (--warning)
-c critical threshold (--critical)
-H hostname (--hostname)
</literallayout>
<para>In addition to the reserved options above, some other standard options are:</para>
<literallayout>
-C SNMP community (--community)
-a authentication password (--authentication)
-l login name (--logname)
-p port or password (--port or --passwd/--password)
-u url or username (--url or --username)
</literallayout>
<para>Look at check_pgsql and check_procs to see how I currently
think this can work. The standard options should behave as follows:</para>
<para>The option -V or --version should be present in all
plugins. For C plugins it should result in a call to print_revision, a
function in utils.c which takes two character arguments, the
command name and the plugin revision.</para>
<para>The -? option, or any other unparsable set of options,
should print out a short usage statement. Lines should be
80 characters wide or less, and no more than 23 lines should be printed (it
should display cleanly on a dumb terminal in a server
room).</para>
<para>The option -h or --help should be present in all plugins.
In C plugins, it should result in a call to print_help (or
equivalent). The function print_help should call print_revision,
then print_usage, then should provide detailed
help. Help text should fit on an 80-character width display, but
may run as many lines as needed.</para>
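<para>For a plugin written in Python, the reserved options could be wired
up with the standard argparse module, as in the sketch below. The plugin
name check_demo, the version string, and the default values are
placeholders invented for this example.</para>
<programlisting>
```python
import argparse

STATE_UNKNOWN = 3


class PluginArgumentParser(argparse.ArgumentParser):
    """Parser that reports bad options as UNKNOWN (hypothetical class)."""

    def error(self, message):
        # argparse exits with status 2 by default, which Nagios would read
        # as CRITICAL; print one status line plus usage, then exit UNKNOWN.
        print(f"UNKNOWN - {message}")
        self.print_usage()
        self.exit(STATE_UNKNOWN)


def build_parser():
    parser = PluginArgumentParser(
        prog="check_demo", description="Example check (placeholder text)."
    )
    # -h/--help is added automatically by argparse.
    parser.add_argument(
        "-V", "--version", action="version", version="%(prog)s 0.1"
    )
    parser.add_argument("-H", "--hostname", required=True)
    parser.add_argument("-w", "--warning", type=float, default=80.0)
    parser.add_argument("-c", "--critical", type=float, default=90.0)
    parser.add_argument("-t", "--timeout", type=int, default=10)
    return parser
```
</programlisting>
<para>Overriding error is the important design choice here, because the
default exit status for a parsing failure would otherwise be mistaken for a
service state.</para>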
</section>
<section>
<title>Plugins with more than one type of threshold, or with
threshold ranges</title>
<para>Old style was to do things like -ct for critical time and
-cv for critical value. That goes out the window with POSIX
getopt. The allowable alternatives are:</para>
<orderedlist>
<listitem>
<para>long options like -critical-time (or -ct and -cv, I
suppose).</para>
</listitem>
<listitem>
<para>repeated options like `check_load -w 10 -w 6 -w 4 -c
16 -c 10 -c 10`</para>
</listitem>
<listitem>
<para>for brevity, the above can be expressed as `check_load
-w 10,6,4 -c 16,10,10`</para>
</listitem>
<listitem>
<para>ranges are expressed with colons as in `check_procs -C
httpd -w 1:20 -c 1:30`, which will warn above 20 instances
and go critical at 0 or above 30 instances</para>
</listitem>
<listitem>
<para>lists are expressed with commas, so Jacob's check_nmap
uses constructs like '-p 1000,1010,1050:1060,2000'</para>
</listitem>
<listitem>
<para>If possible when writing lists, use tokens to make the
list easy to remember and independent of order. For example,
check_disk uses '-c 10000,10%' so that it is clear which is
the percentage and which is the KB value (note that due to
my own lack of foresight, that used to be '-c 10000:10%', but
such constructs should all be changed for consistency,
though providing backward compatibility is fairly
easy).</para>
</listitem>
</orderedlist>
<para>As always, comments are welcome - making this consistent
without a host of long options was quite a hassle, and I would
suspect that there are flaws in this strategy. Perhaps clear
long options are the most important of the above choices, but not
all POSIX systems have C libraries for long options, so the
short forms must exist as well.</para>
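<para>The colon and comma notations described in this section can be
sketched in Python as follows, using the check_procs example of "-w 1:20
-c 1:30". The function names are hypothetical, and treating a range as
inclusive at both ends is an assumption based on that example.</para>
<programlisting>
```python
def parse_range(spec):
    """Parse a 'low:high' threshold into a (low, high) tuple of floats."""
    low_text, sep, high_text = spec.partition(":")
    if not sep:
        raise ValueError(f"not a range: {spec!r}")
    return float(low_text), float(high_text)


def outside(value, spec):
    """True when value falls outside the inclusive range in spec."""
    low, high = parse_range(spec)
    return value > high or low > value


def parse_list(spec):
    """Split a comma-separated threshold list such as '10,6,4'."""
    return [float(item) for item in spec.split(",")]


def check_count(count, warning, critical):
    """Return 2 (critical), 1 (warning) or 0 (OK) for a process count."""
    if outside(count, critical):
        return 2
    if outside(count, warning):
        return 1
    return 0
```
</programlisting>
<para>With these helpers, check_count(25, "1:20", "1:30") yields a warning
and check_count(0, "1:20", "1:30") yields a critical result, matching the
behaviour described for check_procs.</para>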
</section>
</section>
<section id="SubmittingChanges"><title>New submissions and patches</title>
<para>If you would like others to use your plugins and have them included in
the standard distribution, please include patches for the relevant
configuration files, in particular "configure.in". Otherwise, submitted
plugins will be included in the contrib directory.</para>
<para>Plugins in the contrib directory are going to be migrated to the
standard plugins/plugin-scripts directory as time permits and per user
requests.</para>
<para>Patches should be submitted via SourceForge and announced on
the mailing list.</para>
<para>For new plugins, provide a diff to add to the EXTRAS list (configure.in)
unless you are fairly sure that the plugin will work for all platforms with
no non-standard software added.</para>
<para>If possible, please submit a test harness. Documentation on sample
tests is coming soon.</para>
</section>
</article>
</book>