Re: Bugs on status notifications
Fri, 25 Nov 2005 23:28:30 +0100
On 22. nov. 2005, at 19.24, Don Parker wrote:
> I have 4 processes I am monitoring, and each has a rule similar to:
>   If failed port 1234 with timeout 15 seconds for 3 times
>   within 5 cycles then restart
> The rules are identical for the 4 processes except for the ports.
> When I look at them through the web server I see that 2 of
> them have augmented the rule I placed in the rc file with "
> else if passed 1 times within 1 cycle(s) then alert " which
> is creating a lot of records in my log I do not want to see.
Do you run with the -v debug option? If so, please turn it off; it
will flood the log file and should only be used when you really need it.
> I have no idea where this came from - it is not in my rc file.
Ah yes, maybe it is better for Martin to chip in here, since he is
responsible for most of the event system and can give a better
explanation of the rationale behind it. The thing is, AFAIK,
an else clause is automatically added to every if-test if none was
specified. This is done so that if a test fails you get an alert both when
the service goes down _and_ when it comes back up. The idea is
that this information can be useful. Let's say a service goes down in
the middle of the night and you get an SMS from monit. Right after,
monit will send you another SMS (if so configured) saying that the service
is back online, if it managed to fix the problem. That way, you
can go back to sleep.
The automatic up notification will also be very useful when we finally
get m/monit released, since m/monit collects both up and down alerts
and can display them in a nice statistical diagram, which can be
useful for historical and SLA reasons.
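To make the implicit branch concrete, here is a minimal sketch of one of the four checks. The process name, pidfile and start/stop commands are made up for illustration; only the port rule comes from the original mail, and the commented else line is copied from what the web interface displays. It is not something that has to be typed into the rc file.

```
check process myproc with pidfile /var/run/myproc.pid   # hypothetical name and path
  start program = "/etc/init.d/myproc start"            # hypothetical command
  stop program  = "/etc/init.d/myproc stop"             # hypothetical command
  if failed port 1234 with timeout 15 seconds for 3 times within 5 cycles then restart
  # No else branch is written above. As described in this thread, monit then
  # behaves as if one had been given, and the web interface shows it as:
  #   else if passed 1 times within 1 cycle(s) then alert
```

So the extra text in the web interface is the recovery half of the same test, not a second rule.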
> I have set up logging to a file rather than to use syslog,
> and I launch Monit through "integration with init". I see
> all entries being sent to the log also being echoed on tty1.
> Not a big deal, but not expected either.
This is a "side effect" of wanting a tty connected to
monit for debug purposes when it runs in non-daemon mode. To see what I
mean, try running monit from the console like so: "monit -v validate",
which will run monit once and print out lots of interesting debug
info to the console. Anyway, to turn this off when run from init, do
the standard file descriptor redirect like so, "monit -I 2>&1 1>/dev/
> This is worse than I thought. I tried to turn off the alerts by adding
> my own "else if passed then exec <something>" clause. While my exec
> gets executed, I still see the alerts on passed tests.
Every if-test raises an alert even if you have specified another
action in the "then" clause, such as an exec. To turn off alerts,
simply remove the "set alert.." statement from the monitrc file and
optionally add the "alert .." statement only to the services that you want
to raise alerts. Please see the manual for more information.
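As a rough sketch of that layout (service names, paths and the mail address below are hypothetical, and mail delivery is assumed to be configured elsewhere in the file), the global recipient is commented out and only one service carries its own alert statement:

```
# Global recipient, disabled: with this line active, every service alerts.
# set alert sysadm@example.com

# No alert statement here, so this service has no alert recipient.
check process quiet with pidfile /var/run/quiet.pid      # hypothetical
  start program = "/etc/init.d/quiet start"
  stop program  = "/etc/init.d/quiet stop"
  if failed port 1234 with timeout 15 seconds for 3 times within 5 cycles then restart

# Alerts for this service only.
check process important with pidfile /var/run/important.pid   # hypothetical
  start program = "/etc/init.d/important start"
  stop program  = "/etc/init.d/important stop"
  if failed port 1235 with timeout 15 seconds for 3 times within 5 cycles then restart
  alert ops@example.com                                   # hypothetical address
```

The restart action still runs for both services; only the notification differs.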
> Q: Is there any relationship between timeouts and poll intervals? For
> example, if my monitrc file has "set daemon 15" and I have a port
> rule with a timeout of 30 seconds, does my port check rule really wait
> 30 seconds or does monit see it failing every 15 seconds?
Monit runs sequentially in a "validate->sleep->validate->sleep.."
pattern, where "validate" is the testing of all services mentioned in
the monitrc file. If you have a timeout of 30 sec in a port check, monit
really waits up to 30 seconds before timing out. This, of course, means
that a poll-cycle time really is (sleep-time + validate-time).
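Worked through with the numbers from the question, and assuming that a hanging port check blocks for its full timeout while everything else in the validation pass takes negligible time, the arithmetic looks roughly like this (process name and paths are hypothetical):

```
set daemon 15    # sleep-time: 15 seconds between validation passes

check process myproc with pidfile /var/run/myproc.pid   # hypothetical
  start program = "/etc/init.d/myproc start"
  stop program  = "/etc/init.d/myproc stop"
  if failed port 1234 with timeout 30 seconds then restart

# Port answers at once:      15 s sleep + ~0 s validate      = about 15 s per cycle
# Port hangs until timeout:  15 s sleep + up to 30 s         = up to 45 s per cycle
# Four such checks all hanging, tested one after another:
#                            15 s sleep + 4 * 30 s validate  = up to 135 s per cycle
```

In other words, the check is not re-run every 15 seconds while it is still waiting; a slow check stretches the cycle instead.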
The good thing about open-source is that you can read the code and
see how monit really works and even send us patches if you want to
fix something :)