monit-general

Re: [monit] Re: Problem with monit's "not monitoring" status


From: Martin Pala
Subject: Re: [monit] Re: Problem with monit's "not monitoring" status
Date: Tue, 30 Mar 2010 00:02:41 +0200

The new behavior will allow you to script it like this (pseudocode):

monit stop servicename
sleep 60
until monit start servicename; do sleep 5; done


=> the temporary error returned while another action is pending lets the script wait, e.g. an extra 5 s, and try again; as soon as the previous action completes, the start will succeed
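
A retry loop like the one above will spin forever if the start never succeeds, so a bounded version may be safer in automation. The following is only a minimal sketch, not part of monit: the helper name retry_until_ok and its arguments are hypothetical, and it assumes that "monit start" exits with a non-zero status when it reports the temporary error.

```shell
#!/bin/sh
# Hypothetical helper (not part of monit): run a command until it succeeds,
# giving up after MAX_TRIES attempts and sleeping DELAY seconds in between.
# Usage: retry_until_ok MAX_TRIES DELAY COMMAND [ARGS...]
retry_until_ok() {
    max_tries=$1
    delay=$2
    shift 2
    tries=0
    # "$@" is the command to retry; the loop ends as soon as it exits 0.
    until "$@"; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            return 1    # give up: every attempt failed
        fi
        sleep "$delay"
    done
    return 0
}
```

With that helper defined, the restart from this thread could be written as "monit stop servicename; sleep 60; retry_until_ok 12 5 monit start servicename", which retries the start for up to about a minute and returns a failure status instead of looping indefinitely.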

Regards,
Martin


On Mar 29, 2010, at 8:23 PM, Brian Gupta wrote:

Martin,

Ideally we'd like to see an option for monit to queue the start request rather than just failing. Is this doable?

For our use case, we are automating the restart like so: "sudo monit stop servicename; sleep 60; sudo monit start servicename". If what you are telling us is true, the monit stop command is occasionally taking more than 60 secs to complete. Am I understanding correctly?

Thanks,
Brian

On Mon, Mar 29, 2010 at 12:12 PM, Martin Pala <address@hidden> wrote:
Hi David,

the problem was caused by overlapping stop and start actions: when a start action was requested while a stop action was still pending (before the stop had completed), the start action was ignored. The fix is available in svn (monit will return a temporary error when an action is attempted before the previous action has completed). I'll prepare a fix release for you (there are more changes which need to be finished and tested first).


Best regards,
Martin



On Mar 29, 2010, at 5:15 PM, David Bristow wrote:

> Anybody have any ideas about this?  It's an intermittent problem for us.
>
> On Tue, Mar 23, 2010 at 12:47 PM, David Bristow <address@hidden> wrote:
>> We are running monit v5.1.1 now.  I have another incident where monit
>> status was showing a service down, but it should have been started.
>> Here are the details:
>>
>> /var/log/monit.log at around the right time:
>>
>> [EDT Mar 22 10:47:12] debug    : stop service 'backgroundrb' on user request
>> [EDT Mar 22 10:47:12] info     : monit daemon at 11112 awakened
>> [EDT Mar 22 10:47:32] error    : 'memcached_fragments' failed to start
>> [EDT Mar 22 10:47:32] info     : 'backgroundrb' stop:
>> /usr/local/bin/backgroundrb_wrapper
>> [EDT Mar 22 10:47:42] debug    : start service 'backgroundrb' on user request
>> [EDT Mar 22 10:47:42] info     : monit daemon at 11112 awakened
>> [EDT Mar 22 10:47:47] info     : 'backgroundrb' start action done
>> [EDT Mar 22 10:47:47] info     : Awakened by User defined signal 1
>>
>> This is the log entry from when I had to manually run "monit start
>> backgroundrb":
>>
>> [EDT Mar 22 10:59:08] debug    : start service 'backgroundrb' on user request
>> [EDT Mar 22 10:59:08] info     : monit daemon at 11112 awakened
>> [EDT Mar 22 10:59:08] info     : Awakened by User defined signal 1
>> [EDT Mar 22 10:59:08] info     : 'backgroundrb' start:
>> /usr/local/bin/backgroundrb_wrapper
>> [EDT Mar 22 10:59:20] info     : 'backgroundrb' start action done
>>
>> backgroundrb file is from /etc/monit.d/
>>
>> monit status output from when I got alerted about the service being down
>>
>> The server's /etc/monitrc
>>
>> Let us know if you need more.
>>
>> --
>> David Bristow <address@hidden>
>>
>
>
>
> --
> David Bristow <address@hidden>
>
>
> --
> To unsubscribe:
> http://lists.nongnu.org/mailman/listinfo/monit-general



