
Re: Monit switches to "not monitored" state occasionally


From: Christopher Johnston
Subject: Re: Monit switches to "not monitored" state occasionally
Date: Fri, 13 Jan 2012 09:30:21 -0500

I can upgrade one of my dev environments to 5.3.2 tonight. I read the release notes and saw that one of the fixes (in 5.3.1, I think) was a speed-up in finding monitored apps. I should have some feedback early next week from our ops guys after they do a few restarts.

On Fri, Jan 13, 2012 at 9:24 AM, Martin Pala <address@hidden> wrote:
Christopher, can you please provide the full monit logs for the timeframe when some of these problems occurred, along with your monit configuration?

Could you also try upgrading some of your systems to monit 5.3.2 and running it in verbose mode (add the -v option)? The fix mentioned below, for the monitoring state shown while a restart is pending, may be related to the problem.

Regarding the PPID error: it was probably generated because monit had a problem collecting the process data. The monit logs should provide more information.

Regards,
Martin


On Jan 13, 2012, at 3:11 PM, Christopher Johnston wrote:

Martin,

I actually see this happen a lot as well on my systems, where we restart a large number of apps on a daily code drop (sometimes 100s of systems x 6 apps per box). Some apps will go to an unmonitored state, yet the application is still up and running and the pid file has a matching pid. The only way I have been able to resolve it is to restart monit altogether and manually monitor the app again. It causes a lot of grief for my ops guys.

Here is another error string I saw the other night, where the PPID magically changed from 507 to 0. The only way to resolve it has been to fully restart monit with the same procedure as above.

I am using monit version 5.2.5.

<27> Jan 11 17:55:15.547617 -05:00 prod005 monit[5484]: 'WEB01' process PPID changed from 507 to 0

-Chris

On Fri, Jan 13, 2012 at 9:01 AM, Martin Pala <address@hidden> wrote:

On Jan 13, 2012, at 2:45 PM, Johannes Bauer wrote:

> Hi Martin,
>
> On 13.01.2012 14:16, Martin Pala wrote:
>
>> you should check the monit logs - it will show why the service monitoring was disabled (whether it was some manual action, etc.).
>
> Well, monit is configured to log to syslog:
>
> set logfile syslog facility log_daemon
>
> And I can see that there are messages when monit starts, that the
> control file syntax is okay, but that's it. There's no indication
> whatsoever why the processes are in the unmonitored state -- this is
> actually why I'm asking: because the logs do not show anything out of
> the ordinary yet monit put all processes in the "unmonitored" state.
>
> Is there any automatic action which would cause monit to put a monitored
> child into "unmonitored" on its own? If so, how can this mechanism be
> disabled?


There are two ways a service can become unmonitored automatically:

1.) when the "if <x> restarts within <y> cycles then timeout" statement is used, monit will unmonitor the service if this condition matches

2.) when you use a dependency ("depends on <service>") and the parent service is stopped/unmonitored (either via the timeout statement or manually by an admin), the stop/unmonitor action cascades to the child services too.
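
As a rough sketch of both cases in a control file (the service names, pidfile paths, init scripts, and restart thresholds below are made up for illustration and are not taken from anyone's actual configuration):

```
# Hypothetical parent service: unmonitored by monit once it has been
# restarted 3 times within 5 cycles (case 1).
check process db01 with pidfile /var/run/db01.pid
  start program = "/etc/init.d/db01 start"
  stop program  = "/etc/init.d/db01 stop"
  if 3 restarts within 5 cycles then timeout

# Hypothetical child service: when db01 is stopped or unmonitored,
# that action cascades to web01 through the dependency (case 2).
check process web01 with pidfile /var/run/web01.pid
  start program = "/etc/init.d/web01 start"
  stop program  = "/etc/init.d/web01 stop"
  depends on db01
```

If neither statement appears in your control file, these two mechanisms should not be the cause.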


Also, Monit <= 5.2.5 *temporarily* displayed "Not monitored" while a service restart was pending; the monitoring state returned to "Monitored" once the restart finished. Because this was confusing, it was changed in Monit 5.3, which displays "Monitored" during the restart too.

If none of the above cases matches your configuration, the most probable cause is that somebody manually unmonitored/stopped the service via Monit.

Regards,
Martin

--
To unsubscribe:
https://lists.nongnu.org/mailman/listinfo/monit-general



