monit-general

Re: Monit does not alert with stop or unmonitor


From: Martin Pala
Subject: Re: Monit does not alert with stop or unmonitor
Date: Tue, 28 Oct 2014 10:22:07 +0100

The PID/PPID tests allow only one test statement, not multiple statements as other tests do. Your test works, but only one of the rules was applied (the second one). We can add support for multiple PID tests, since the single-test limitation is confusing.

Also note that your configuration checks the service only once per 4 cycles because of the following line, which is in fact two independent statements:

--8<--
if changed pid then alert every 4 cycles
--8<--

is:

--8<--
if changed pid then alert #send alert if PID changed
every 4 cycles #check the service once per 4 cycles
--8<--
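
Putting both points together, the service entry could be reduced to something like the following. This is only a sketch under assumptions: it keeps a single PID test, drops `every 4 cycles` so the service is checked on every cycle, and uses the `timeout` action for the restart test (the variant reported elsewhere in this thread as producing the expected email). The thresholds are copied from the original configuration and are not a recommendation.

```
check process visit_registry with pidfile /var/run/visit_registry.pid
  group services
  start program = "/sbin/start visit_registry"
  stop program = "/sbin/stop visit_registry"
  # Single PID test: only one "if changed pid" rule takes effect per service.
  if changed pid then alert
  # No "every 4 cycles" statement, so the service is checked on every cycle.
  # Restart test with the timeout action (assumed choice for this sketch).
  if 3 restarts within 6 cycles then timeout
```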


Regards,
Martin


On 27 Oct 2014, at 18:13, Stenver Jerkku <address@hidden> wrote:

Hey

I discovered that having multiple `if changed pid` statements in a row does not work. So as a workaround I just used:
  if changed pid then alert

I discovered that my upstart script didn't have a respawn limit, so I fixed that, and then when I used `timeout when too many restarts`, I got the proper email.

I think it's a bug in Monit, because I tested this quite extensively. :/

Sincerely,

Stenver Jerkku
+372 53734335
Tartu Ülikool, Information Technology

On Mon, Oct 27, 2014 at 6:00 PM, Martin Pala <address@hidden> wrote:
Hello,

Do you have the "set mailserver" and "set alert" statements in your configuration file? Also check that you are not filtering out the mentioned events (see the manual for more details: http://www.mmonit.com/monit/documentation/monit.html#setting_an_event_filter)
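
For reference, a minimal sketch of the global statements in question. The mail server host `localhost` and the excluded `instance` event are assumptions chosen for illustration; they are not taken from the original configuration.

```
# Mail server used for delivering alerts (localhost is an assumed placeholder).
set mailserver localhost

# Global alert recipient with no event filter: all event types are delivered.
set alert address@hidden

# Filtered alternative (use instead of the line above): everything except
# instance events (Monit start/stop notifications).
# set alert address@hidden not on { instance }
```

If a filter is present, the events of interest here (such as `timeout`, `pid` and `ppid`) must not be excluded by it.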

Regards,
Martin


On 24 Oct 2014, at 13:15, Stenver Jerkku <address@hidden> wrote:

Hello

I'm having a problem: Monit does not send a "stop" or "unmonitor" alert when it times out on restarts, PIDs, and PPIDs.

Here are relevant configurations:

  set alert address@hidden

  check process visit_registry with pidfile /var/run/visit_registry.pid
    group services
    start program = "/sbin/start visit_registry"
    stop program = "/sbin/stop visit_registry"
    if changed pid then alert every 4 cycles
    if changed pid 4 times within 8 cycles then stop
    if changed ppid 4 times within 8 cycles then stop
    if 3 restarts within 6 cycles then stop
    if cpu > 60% for 2 cycles then alert
    if cpu > 80% for 5 cycles then restart
    if totalmem > 1000.0 MB for 5 cycles then alert
    if totalmem > 1500.0 MB for 5 cycles then alert
    if totalmem > 2000.0 MB for 5 cycles then restart

In the log, I can see:

[UTC Oct 24 10:55:07] error    : 'visit_registry' process PID changed from 10533 to 16947
[UTC Oct 24 10:55:08] info     : 'visit_registry' stop: /sbin/stop

But I never get an alert that Monit stopped or unmonitored the service. I also tried using timeout or unmonitor instead of stop.

Monit version 5.6

Sincerely,

Stenver Jerkku
Salemove Inc
Tartu university, Software Engineering
--
To unsubscribe:
https://lists.nongnu.org/mailman/listinfo/monit-general



