Re: monit unresponsive to status requests?

From: Randy Bias
Subject: Re: monit unresponsive to status requests?
Date: Thu, 14 Jul 2005 20:09:01 -0700


Thanks for the quick reply. The application in question is a database. In this particular instance, monit is operating in conjunction with some clustering software: the cluster management tools talk to monit to bring up and shut down services as appropriate. Right now I'm working around the problem by looping until monit returns something useful, but it would definitely be nice if monit returned something, even a simple "monit is busy; try again later". Heck, I would even take a non-zero exit code (1 in the shell) from monit. Right now it hangs for several seconds, prints "cannot read status from the monit daemon", and gives me an exit code of 0 ("true" in the shell).

I'm working around it, but again, it would be nice if it were a bit more interactive while working on starting/stopping processes, especially in an environment like ours where you may have other intelligent managers using monit as the interface for service control.
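The polling workaround described above could look roughly like the following Python sketch. It assumes the `monit status` command and the error text quoted elsewhere in this thread; the function names, the attempt count, and the delay are hypothetical. Because monit exits with 0 even when it cannot reach the daemon, the sketch checks the output text rather than the exit code.

```python
import subprocess
import time

# Error text monit prints when the daemon does not answer (quoted in this thread).
BUSY_MARKER = "cannot read status from the monit daemon"


def run_monit_status():
    """Run `monit status` and return its combined stdout/stderr as text."""
    proc = subprocess.run(
        ["monit", "status"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.stdout


def wait_for_status(run=run_monit_status, attempts=10, delay=2.0, sleep=time.sleep):
    """Poll until the output no longer contains the busy marker.

    The exit code is deliberately ignored, since it is 0 even on failure.
    Returns the status text, or None if every attempt hit the busy marker.
    `attempts` and `delay` are illustrative values, not monit settings.
    """
    for attempt in range(attempts):
        output = run()
        if BUSY_MARKER not in output:
            return output
        if attempt < attempts - 1:
            sleep(delay)  # give monit time to finish the pending stop/start
    return None
```

A cluster manager would call `wait_for_status()` and treat a `None` result as "monit is busy" instead of trusting the exit code.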

Actually, this reminds me of another question that should arguably be in a separate e-mail: have you considered having monit handle multiple PIDs in one file, or multiple PID files, for a single "service" that has multiple processes? For example, nfsd typically starts up with multiple processes. It would be nice to tell monit "make sure there are always 8 of these running" and give it 8 PIDs to check. This seems like a very useful feature.
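To make the request concrete, here is a minimal Python sketch of such a check, written as an external script rather than as anything monit provides. It assumes a POSIX system and a PID file containing whitespace-separated PIDs; all function names are hypothetical.

```python
import os


def pid_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence; nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def read_pids(path):
    """Read whitespace-separated PIDs from a file (assumed file layout)."""
    with open(path) as f:
        return [int(token) for token in f.read().split()]


def count_running(pids, alive=pid_alive):
    """Count how many of the given PIDs are currently alive."""
    return sum(1 for pid in pids if alive(pid))


def service_ok(pids, expected, alive=pid_alive):
    """True if at least `expected` of the listed processes are running."""
    return count_running(pids, alive) >= expected
```

For the nfsd example, `service_ok(read_pids(path), 8)` would report whether all 8 processes are still present, where `path` is whatever PID file the service writes.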



On Jul 14, 2005, at 4:03 PM, Jan-Henrik Haukeland wrote:

When monit is asked to stop a process, it waits for the process to stop before it continues. If the process takes a long time to stop, monit will hang for up to one poll cycle before it times out waiting for the process. The only reason this is sequential is the restart case: monit restarts a process by first calling the program's stop-command and then calling the start-command, so monit waits to make sure the process has stopped before the start-command is called. Starting a new process, by contrast, is non-sequential and is done in a new thread, since this is considered "safe".

Unfortunately, when monit waits on a process to stop, the HTTP thread also stops, since all of this is run from the HTTP thread. You may call this a minor design problem, and you would be right. There are ways to work around it in the code, e.g. using a new thread, non-blocking techniques, a queue, or flags, but we haven't really considered this a major problem before, because monit does work and behaves correctly with regard to monitoring.
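As an illustration of the "new thread" option mentioned above, here is a minimal Python sketch. It is not monit's code (monit is written in C), and the class and method names are hypothetical. The blocking wait runs in a worker thread, so the thread that answers status requests can report a "stopping" state right away instead of hanging.

```python
import threading


class ServiceController:
    """Runs a blocking stop in a worker thread so status() never blocks on it."""

    def __init__(self, stop_fn):
        self._stop_fn = stop_fn  # blocking callable that waits for the process to exit
        self._state = "running"
        self._lock = threading.Lock()
        self._worker = None

    def request_stop(self):
        """Start the stop in the background; return False if one is already pending."""
        with self._lock:
            if self._state != "running":
                return False
            self._state = "stopping"
        self._worker = threading.Thread(target=self._do_stop, daemon=True)
        self._worker.start()
        return True

    def _do_stop(self):
        self._stop_fn()  # may take a long time; only the worker thread waits
        with self._lock:
            self._state = "stopped"

    def status(self):
        """Answer immediately, even while a stop is in progress."""
        with self._lock:
            return self._state

    def wait(self, timeout=None):
        """Optionally block until the background stop has finished."""
        if self._worker is not None:
            self._worker.join(timeout)
```

A restart would still have to wait for the "stopped" state before calling the start-command, so the ordering guarantee described above is kept; only the status path stops blocking.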

Can I ask what kind of program you are stopping that takes so long to stop?


On 14. jul. 2005, at 22.44, Randy Bias wrote:


I'm new to monit, but have a hopefully quick and easy question. I've noticed that monit occasionally becomes unresponsive to status requests, giving the following error:

    monit: cannot read status from the monit daemon

The monit process is running and has not died, but is unresponsive to the status request. Making a local connection attempt to the webserver socket seems to work, although monit takes forever to respond after the socket is open.

As far as I can tell this happens after monit is given any kind of directive. For example, if I tell monit to stop a service it becomes unresponsive for up to 10 or 15 seconds.

Has anyone else noticed this behavior?

Environment is Fedora Core 3, 2.6.9 kernel.



Randy Bias        randy-at-netenrich-dot-com
Director of Application Engineering & Support
