
Re: monit unresponsive to status requests?


From: Randy Bias
Subject: Re: monit unresponsive to status requests?
Date: Fri, 15 Jul 2005 14:53:30 -0700

On Jul 15, 2005, at 6:27 AM, Jan-Henrik Haukeland wrote:
> On 15. jul. 2005, at 09.51, Marco Ermini wrote:
>> Same for me. I don't want OpenView to trigger errors because Monit is
>> not responsive...
>> Don't blame me, but this is not such a "minor" problem IMHO. If you have
>> first level support 24/7/365, it isn't nice to trigger unneeded
>> critical alarms for them...
> Ok, I understand and I can see the problem. I have added this as a task that we need to fix in the next release of monit. See http://www.tildeslash.com/monit/doc/next.php#23.

Great.  That's excellent.

>>> Actually, this reminds me of another question, one that should arguably be in a separate e-mail: Have you considered having monit handle multiple
>>> PIDs in a file, or multiple PID files, for a single "service" that has
>>> multiple processes?  For example, nfsd typically starts up with
>>> multiple processes.  It would be nice to tell monit: make sure there
>>> are always 8 of these running, and give it 8 PIDs to check. This seems
>>> like a very useful feature.
>> This is what OpenView already did... and this is another reason why
>> I can't scale Monit to a bigger environment; it is confined to minor
>> projects for now.

> I'm not sure how we should go about solving this problem. If you have any ideas, please share.

It seems straightforward to me, but then again, I'm not a heavy developer. Turn the single-integer storage of the PID into an array of integers holding anywhere from one up to N PIDs. Put loops around each section of code that currently assumes one PID. Loop through each PID and perform the appropriate operation. Keep a count of the successes. Match it against the specified number of running processes. If we've dropped below the specified minimum number of PIDs, restart the service.
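
To make the idea concrete, here is a rough sketch of that loop-and-count logic. It is written in Python only to keep it short and is not monit code. The function names, the whitespace-separated PID file format, and the `minimum` parameter are all assumptions made up for illustration. The liveness probe relies on the POSIX behaviour of signal 0, so it should only be run on a Unix-like system.

```python
import os
from typing import Callable, Iterable, List


def parse_pids(text: str) -> List[int]:
    """Parse whitespace-separated PIDs (assumed file format), skipping junk."""
    pids = []
    for token in text.split():
        try:
            pid = int(token)
        except ValueError:
            continue  # not a number, ignore it
        if pid > 0:
            pids.append(pid)
    return pids


def pid_is_alive(pid: int) -> bool:
    """Probe a PID with signal 0 (POSIX): nothing is delivered, only existence is checked."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True   # process exists but belongs to another user
    return True


def count_alive(pids: Iterable[int],
                is_alive: Callable[[int], bool] = pid_is_alive) -> int:
    """Loop over every PID and count the ones that pass the liveness check."""
    return sum(1 for pid in pids if is_alive(pid))


def needs_restart(pids: Iterable[int], minimum: int,
                  is_alive: Callable[[int], bool] = pid_is_alive) -> bool:
    """True when fewer than `minimum` of the listed processes are running."""
    return count_alive(pids, is_alive) < minimum
```

With something like this, the nfsd case would come down to calling `needs_restart(parse_pids(contents_of_pid_file), 8)` on each check cycle and restarting the service when it returns true.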

Now, that said, I haven't gone through the monit code very deeply. Mostly just scanning through validate.c. So it may architecturally be an issue.

Look forward to your response.



--Randy




