[Nagiosplug-help] IIS Load and throughput
Anthony Montibello
amontibello at gmail.com
Fri Oct 5 18:47:05 CEST 2007
"The real source of the problem is Windows that cannot provide real
counters from which you can derive precise values by calculating the
time between two polls. Since Windows can't be fixed the only solution
left is a daemon that does the work for us."
This will be my last reply on this thread, since I think it may no longer
benefit the Nagios community.
Windows does actually provide real counters (and the sample interval between
polls can be varied), but I agree the problem lies with Windows.
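To illustrate what deriving a rate from two polls looks like when a raw
cumulative counter is available, here is a minimal Python sketch. The function
name and the choice to return None when the counter goes backwards (for
example after a service restart) are assumptions made for this example; this
is not NC_Net code.

```python
def rate_per_second(prev_value, prev_time, curr_value, curr_time):
    """Derive an average rate from two polls of a cumulative counter.

    Hypothetical helper: times are in seconds, values are raw counter readings.
    """
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("polls must be taken at increasing times")
    delta = curr_value - prev_value
    if delta < 0:
        # The counter went backwards, most likely a reset; no rate is known.
        return None
    return delta / elapsed
```

The result is the true average over the whole interval between the two polls,
which is exactly what a single instantaneous sample cannot give you.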
There is no good mechanism for retrieving average values remotely without
keeping a local cache of intermediate values.
For example, if you want a 5 min average of a counter, it MUST be prepared
ahead of time (Windows cannot provide it in real time); otherwise the
monitoring server would have to wait 5 min (unacceptable).
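A rough Python sketch of such a local cache follows. It only illustrates the
idea and is not how NC_Net does it; the class name RollingAverage, its
methods, and the 300 second default are all made up for this example.

```python
from collections import deque
import time


class RollingAverage:
    """Keep recent samples locally so a windowed average is ready on request."""

    def __init__(self, window_seconds=300.0):
        self.window_seconds = window_seconds
        self.samples = deque()  # (timestamp, value) pairs, oldest first

    def _expire(self, now):
        # Drop samples that are older than the averaging window.
        cutoff = now - self.window_seconds
        while self.samples and self.samples[0][0] < cutoff:
            self.samples.popleft()

    def add(self, value, now=None):
        # Called by the local agent on a short timer, e.g. every few seconds.
        now = time.time() if now is None else now
        self.samples.append((now, value))
        self._expire(now)

    def average(self, now=None):
        # Called when the monitoring server polls; returns immediately.
        now = time.time() if now is None else now
        self._expire(now)
        if not self.samples:
            return None
        return sum(v for _, v in self.samples) / len(self.samples)
```

The agent pays the cost of sampling continuously, so the monitoring server
gets the 5 min figure without waiting 5 min for it.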
Is anyone else interested in retrieving more accurate averages of these
values, or is this a dead end?
Tony (author of NC_Net)
On 10/5/07, Thomas Guyot-Sionnest <dermoth at aei.ca> wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 05/10/07 01:55 AM, Anthony Montibello wrote:
> > "The problem with them is that they cannot get
> > the average value for a real period of time. Also NC_Net doesn't handle
> > counters that go away momentarily (i.e. restarted service).
> > <http://sourceforge.net/projects/nc-net>"
> >
> > Did you test NC_Net v 4.1a for the counters going away issue? I re-coded
> > the Counter check such that it should be fixed, unless the counter is
> > away at the moment of testing, but you should be able to recheck it for
> > a good value. (please let me know about this)
>
> That's good news. However as I said we're not using this system to graph
> our systems' perfmons as it was originally planned and the decision isn't
> mine. There's still some NC_Net instances here and there but I most
> likely won't have time to play with that anytime soon. And finally the
> most problematic counters are not monitored anymore because of some
> architectural changes (they're useless now).
>
> > I agree about the problem in recovering the rate-over-time values, and
> > have a few notes about it.
> > Does the accuracy of the value really matter? When it is sampled you
> > get a snapshot of the value at that moment, and with lots of
> > instantaneous samples put into an RRD, isn't that close enough to give a
> > good overview of the load?
> >
> > I would think many monitoring apps do not take into account that these
> > rates need to be prefetched in order to determine a rate/time for X min.
> > It is all a compromise of resources, and since the users are removed from
> > the details of the implementation they naturally assume it is what the
> > label implies. To put it simply, a brief 1/2 sec check every 5 min of
> > a time/rate value is NOT a 5 min average, and in some cases it may not be
> > a good representation of the average.
>
> Well, the problem is dual-faced. Not only is the time sampled very short,
> it's still a bit long to have a poller run through all counters every
> minute. As a workaround we use Nagios to poll the counters and use a
> Nagios performance data caching daemon to hold the latest value so that
> Cacti can fetch them in almost no time (In our system we get samples
> every minute - that's also the precision of our RRDs).
>
> The real source of the problem is Windows that cannot provide real
> counters from which you can derive precise values by calculating the
> time between two polls. Since Windows can't be fixed the only solution
> left is a daemon that does the work for us.
>
> > I think in some cases this does matter; that's why NC_Net implements
> > CPULOAD in the way you described: it keeps an internal RRD of the CPU
> > (_TOTAL) with the time between samples configurable in the
> > config, roughly 12 times a min. Then when CPU load is requested it
> > calculates the average.
> >
> > I have also thought of a similar proxy for Windows that basically does
> > what you describe, although I have not had funding to begin working on
> > it. One of the issues is that most people assume if you can retrieve
> > the counter or the value from WMI then it's OK, and they forget that
> > Microsoft sometimes is a bit misleading.
>
> Yes that's another issue. I understood the problem by reading M$ kb
> articles about the different ways to poll performance data but I don't
> expect many people will even find (or look for) them :)
>
> A good start could be explaining this issue on your NC_Net page and hoping
> NC_Net users will read it...
>
> Thomas
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.6 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iD8DBQFHBi2E6dZ+Kt5BchYRAiMIAJ41P+dG6kTfsM+p6w8XwbAlBJtmEACcD0u6
> dTFltlPu7vDnqGryeESLsBs=
> =7/b0
> -----END PGP SIGNATURE-----
>