From: Fei, Yuming
Subject: RE: Monit not able to start after box reboot
Date: Mon, 11 Jun 2012 22:56:04 -0500

The same problem exists with the service pid files too. Furthermore, it seems
that the use of pidfiles as below, as shown in many examples, does not really
work:

    check process ntpd with pidfile /var/run/ntpd.pid
        start program = "/etc/init.d/ntpd start"
        stop program = "/etc/init.d/ntpd stop"

This is because there is always a chance that the process dies and another
process starts and obtains the same pid before monit detects it. Monit may send
out an alert if the ppid changes, but it will think that the process is running
and won’t run the start program. This may happen at system reboot time, but it
can also happen at run time, in which case the tmpfs solution may not help.

If this is true, it may explain what I have also seen: after a system reboot,
monit does not always start the processes…

Maybe I am missing something here, but please let me know.

Yuming

From: Fei, Yuming

Thanks Martin, placing the pidfile on
tmpfs will help. So, is there a way to avoid this problem when the pidfile is
placed on disk, or should it currently not be a disk file at all, since a disk
file persists across reboots?

Yuming

From: address@hidden [mailto:address@hidden] On Behalf Of Martin Pala

If you run the monit binary, it checks whether the daemon is running
already by reading the PID from its pidfile and looking for the given process.
It seems that after your system rebooted, some other process obtained the same
PID, so monit thinks that it is already running and doesn't daemonize itself.

The solution could be to place the pidfile on tmpfs (a memory-based
filesystem), which is not persistent across reboots, so the file will
disappear when you reboot the system. The pidfile location can be set with the
"set pidfile" statement.

Regards,
Martin

On Jun 11, 2012, at 11:24 PM, Fei, Yuming wrote:

Hi all,

I have seen a problem that monit is not
able to start after the box reboots. Monit is run from init as a daemon.

This problem happens occasionally. When it does happen, I see the following.
First of all, the .monit.pid file was not removed after the box shut down. Then
monit was started from init, but it wrote “monit daemon at 12327 awakened” into
its log, where 12327 is the pid in .monit.pid. The monit startup process then
exited and no monit process was left running.

Cleaning up the .monit.pid file helped: after removing it and restarting monit,
monit came up.

Has anyone experienced this? What could cause this to happen?

Looking at monit’s source code, I
found these:

(1) The removal function of the pid file is registered with atexit(); however,
it may not be called if the process terminates abnormally. Thus the pid file
may not be removed, which is probably what was seen above.

(2) When monit starts up, it retrieves the pid value from the pid file and
calls getpgid to check it. But if the process with that pid is a zombie, the
getpgid check will pass, and monit thinks that the monit daemon is running and
will “awake” that daemon, which is actually a zombie. The result is that the
monit daemon won’t come up. It doesn’t seem that monit tests whether there is
a monit zombie process during startup.

Also, if by any chance there is another process running with the same pid as
in .monit.pid, monit will send a signal to it to “awake” it …, and that may
kill the process.

This is seen in monit 5.1.1, and seems to be in the latest version 5.4 as well.

Thanks
Yuming
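
The two failure modes discussed in this thread (a stale pid that now belongs to
a zombie, and a stale pid that has been reused by an unrelated process) can be
illustrated with a small sketch. This is not monit's code. It assumes Linux
with /proc mounted, and the helper names (pid_exists_naive,
read_proc_state_and_name, is_live_daemon) are hypothetical, chosen only for
this illustration. The naive check mirrors the getpgid test described in point
(2); the stricter check additionally reads the process state and name from
/proc/<pid>/stat.

```python
import os


def pid_exists_naive(pid):
    """Existence test in the style described in the thread: does getpgid succeed?"""
    try:
        os.getpgid(pid)
        return True
    except ProcessLookupError:
        return False


def read_proc_state_and_name(pid):
    """Return (state, comm) from /proc/<pid>/stat, or None if there is no such process.

    Linux-specific assumption: the file has the form "pid (comm) state ...".
    """
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # comm may itself contain spaces or parentheses, so use the last ')'.
    lpar = data.index("(")
    rpar = data.rindex(")")
    comm = data[lpar + 1:rpar]
    state = data[rpar + 1:].split()[0]
    return state, comm


def is_live_daemon(pid, expected_name):
    """Stricter test: the pid must exist, not be a zombie, and carry the expected name."""
    info = read_proc_state_and_name(pid)
    if info is None:
        return False
    state, comm = info
    return state != "Z" and comm == expected_name


if __name__ == "__main__":
    child = os.fork()
    if child == 0:
        os._exit(0)  # child exits at once; the parent has not reaped it yet
    # Block until the child has exited, but leave it unreaped (a zombie).
    os.waitid(os.P_PID, child, os.WEXITED | os.WNOWAIT)
    print("naive check on zombie:", pid_exists_naive(child))
    print("state and name       :", read_proc_state_and_name(child))
    print("strict check, 'monit':", is_live_daemon(child, "monit"))
    os.waitpid(child, 0)  # reap the zombie
```

Comparing the name guards against pid reuse only as far as an unrelated
process is unlikely to share the daemon's name; it is a heuristic, not a
guarantee, and comm is truncated by the kernel to a short fixed length.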