If these processes have a common parent process (like apache, which spawns child processes), monit watches the parent process.
If your script starts three independent processes whose parent is init (pid 1), then you will need a workaround. For example, modify the start script to check that all processes have stopped before starting: if any are still running, sleep 1 and check again.
We could also modify monit to check all pids from the pidfile.
Alternatively, split the configuration and startup script into three independent processes (which they really are).
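A minimal sketch of the first workaround (waiting in the start script until every process from the pidfile is gone) could look like the following. The function names `any_running` and `wait_for_pids` and the timeout handling are my own assumptions for illustration, not part of monit or of the existing jboss_eradbre init script, and it assumes the script runs as a user allowed to signal those processes (normally root).

```shell
#!/bin/sh
# Sketch only: block until every PID listed in a pidfile has exited.
# Function names and the timeout argument are hypothetical.

# Return 0 if at least one of the PIDs given as arguments is still alive.
any_running() {
    for pid in "$@"; do
        # kill -0 delivers no signal; it only tests whether the process
        # exists and may be signalled by the current user.
        if kill -0 "$pid" 2>/dev/null; then
            return 0
        fi
    done
    return 1
}

# Wait up to $2 seconds for all PIDs in file $1 to exit.
# Returns 0 when none are left, 1 if the time limit was reached.
wait_for_pids() {
    pidfile=$1
    remaining=$2
    [ -r "$pidfile" ] || return 0        # no pidfile: nothing to wait for
    pids=$(cat "$pidfile")
    # $pids is deliberately unquoted so it splits into one word per PID.
    while any_running $pids; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

In the start branch of the init script, this could be called before launching anything, for example `wait_for_pids /opt/local/software/jboss/jboss-4.0.5.GA/logs/jboss_eradbre.pid 30 || exit 1`, so that a start issued right after a stop does not overlap with processes that are still shutting down.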
On Mar 5, 2009, at 7:38 PM, Perdue, Emmett wrote:

If a "program" that Monit controls has more than 1 PID and all of those are started from a single start script, but ALL must be stopped BEFORE the start command is issued on a restart... how is that done with Monit? Not every piece of software has just a single PID associated with it.

Yes, monit reads only one pid from pidfile.
On Mar 5, 2009, at 7:30 PM, Perdue, Emmett wrote:

Process Name = jboss_eradbre
Group = server
Pid file = /opt/local/software/jboss/jboss-4.0.5.GA/logs/jboss_eradbre.pid
Monitoring mode = active
Start program = '/etc/init.d/jboss_eradbre start' timeout 30 second(s)
Stop program = '/etc/init.d/jboss_eradbre stop' timeout 30 second(s)
Pid = if changed 1 times within 1 cycle(s) then alert
Ppid = if changed 1 times within 1 cycle(s) then alert
Timeout = If 3 restart within 5 cycles then unmonitor else if succeeded then alert
There are multiple PID's in the jboss_eradbre.pid file. 3 in this case. See below:

$ cat /opt/local/software/jboss/jboss-4.0.5.GA/logs/jboss_eradbre.pid
9800 9981 10004
Could this be the problem? Could Monit be stopping the 1st PID and then issuing the start command without waiting for the 2nd and 3rd PIDs to stop? If I run /etc/init.d/jboss_eradbre itself, the problem does not happen; it only happens when Monit handles the process.
If the service is a process, monit executes the stop command and waits for the process whose pid matches the pidfile content to stop. As soon as that process stops, the start script is executed. If the process stops quickly, the start script can be executed very soon afterwards (within the same second).
If the check is for a different service type (like file, directory, host, etc.), then the stop script is executed and immediately followed by start, since monit currently has no way to tell whether the stop script finished OK or not.
What is the configuration of jboss_eradbre service?
You can run monit with -v option to see details.

On Mar 5, 2009, at 5:44 PM, Perdue, Emmett wrote:

I am seeing some strange behavior from Monit when a restart command is issued. When I issue a "monit restart app_name" command, Monit is sending the stop and start commands in the monitrc file back to back within 1/10 of a second. It is not sending the stop command and waiting for it to finish before sending the start command. If I run the scripts outside of Monit, all is fine. What should I look for? Below is a snip of the Monit log from when the problem happens…

[EST Mar 5 10:12:32] debug : restart service 'jboss_eradbre' on user request
[EST Mar 5 10:12:32] info : monit daemon at 25448 awakened
[EST Mar 5 10:12:32] info : Awakened by User defined signal 1
[EST Mar 5 10:12:32] info : 'jboss_eradbre' trying to restart
[EST Mar 5 10:12:32] info : 'jboss_eradbre' stop: /etc/init.d/jboss_eradbre
[EST Mar 5 10:12:33] info : 'jboss_eradbre' start: /etc/init.d/jboss_eradbre

Thank You,
Emmett D. Perdue
CSX Corp.
Sr. Systems Admin - RHCE
Middleware Software Provisioning
Phone: (904) 633-5187 RNX: 633-5187
E-Mail: address@hidden
"Individuals Play the Game, But Teams Win Championships!"
This email transmission and any accompanying attachments may contain CSX privileged and confidential information intended only for the use of the intended addressee. Any dissemination, distribution, copying or action taken in reliance on the contents of this email by anyone other than the intended recipient is strictly prohibited. If you have received this email in error please immediately delete it and notify sender at the above CSX email address. Sender and CSX accept no liability for any damage caused directly or indirectly by receipt of this email. -- To unsubscribe: http://lists.nongnu.org/mailman/listinfo/monit-general