There appears to be a deadlock when monit shuts down. To reproduce it,
you can use the following command:
unicorn:~/cvs/monit# i=9; while [ $i -ge 0 ]; do ./monit -c /etc/monitrc; \
    ./monit -c /etc/monitrc quit; i=$((i-1)); done
Console output:
Starting monit daemon
Starting httpd at [127.0.0.1:2812]
monit daemon with pid [24755] killed
monit daemon at 24755 awakened
monit daemon with pid [24755] killed
monit daemon at 24755 awakened
monit daemon with pid [24755] killed
monit daemon at 24755 awakened
monit daemon with pid [24755] killed
monit daemon at 24755 awakened
monit daemon with pid [24755] killed
monit daemon at 24755 awakened
monit daemon with pid [24755] killed
monit daemon at 24755 awakened
monit daemon with pid [24755] killed
monit daemon at 24755 awakened
monit daemon with pid [24755] killed
Starting monit daemon
Starting httpd at [127.0.0.1:2812]
monit daemon with pid [24775] killed
monit daemon at 24775 awakened
monit daemon with pid [24775] killed
unicorn:~/cvs/monit# ./monit -c /etc/monitrc quit
monit daemon with pid [24775] killed
unicorn:~/cvs/monit# ./monit -c /etc/monitrc quit
monit daemon with pid [24775] killed
unicorn:~/cvs/monit# ./monit -c /etc/monitrc quit
monit daemon with pid [24775] killed
unicorn:~/cvs/monit# ./monit -c /etc/monitrc quit
monit daemon with pid [24775] killed
unicorn:~/cvs/monit# ./monit -c /etc/monitrc quit
monit daemon with pid [24775] killed
unicorn:~/cvs/monit# ./monit -c /etc/monitrc quit
monit daemon with pid [24775] killed
unicorn:~/cvs/monit# ./monit -c /etc/monitrc quit
monit daemon with pid [24775] killed
Log output:
[CEST Sep 17 22:45:13] Starting monit daemon
[CEST Sep 17 22:45:13] Starting httpd at [127.0.0.1:2812]
[CEST Sep 17 22:45:13] Shutting down monit HTTP server
[CEST Sep 17 22:45:13] monit daemon at 24755 awakened
[CEST Sep 17 22:45:13] Awakened by User defined signal 1
[CEST Sep 17 22:45:13] monit daemon at 24755 awakened
[CEST Sep 17 22:45:13] Awakened by User defined signal 1
[CEST Sep 17 22:45:13] Awakened by User defined signal 1
[CEST Sep 17 22:45:13] monit daemon at 24755 awakened
[CEST Sep 17 22:45:13] Awakened by User defined signal 1
[CEST Sep 17 22:45:13] monit daemon at 24755 awakened
[CEST Sep 17 22:45:14] Awakened by User defined signal 1
[CEST Sep 17 22:45:14] monit daemon at 24755 awakened
[CEST Sep 17 22:45:14] Awakened by User defined signal 1
[CEST Sep 17 22:45:14] monit daemon at 24755 awakened
[CEST Sep 17 22:45:14] Awakened by User defined signal 1
[CEST Sep 17 22:45:14] monit daemon at 24755 awakened
[CEST Sep 17 22:45:14] monit HTTP server stopped
[CEST Sep 17 22:45:14] monit daemon with pid [24755] killed
[CEST Sep 17 22:45:14] Starting monit daemon
[CEST Sep 17 22:45:14] Starting httpd at [127.0.0.1:2812]
[CEST Sep 17 22:45:14] Shutting down monit HTTP server
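The monit process with pid 24775 is frozen: you can send as many
"monit quit" commands as you want and it will not terminate. It seems
that monit is blocked in pthread_join(), which is called from the
SIGTERM handler (do_destroy) while the HTTP server thread is still
sitting in poll():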
(gdb) info threads
3 Thread 16386 (LWP 24777) 0x402896e6 in poll () from /lib/libc.so.6
2 Thread 32769 (LWP 24776) 0x402896e6 in poll () from /lib/libc.so.6
1 Thread 16384 (LWP 24775) 0x4002a354 in __pthread_sigsuspend ()
from /lib/libpthread.so.0
(gdb) bt
#0 0x4002a354 in __pthread_sigsuspend () from /lib/libpthread.so.0
#1 0x4002a118 in __pthread_wait_for_restart_signal () from
/lib/libpthread.so.0
#2 0x400274dc in pthread_join () from /lib/libpthread.so.0
#3 0x0804e7d7 in monit_http (action=-4) at monit_http.c:131
#4 0x0804f17e in do_destroy (sig=15) at monitor.c:512
#5 0x4002d685 in __pthread_sighandler () from /lib/libpthread.so.0
#6 <signal handler called>
#7 0x4002d91b in read () from /lib/libpthread.so.0
#8 0x00000006 in ?? ()
#9 0x00000038 in ?? ()
#10 0x0805c651 in read_proc_file (
buf=0xbfffe880 "10233 (mozilla-bin) S 10232 690 690 0 -1 64 23 0 5
0 141 117 0 0 9 0 0 0 1843919 89362432 14890 4294967295 134512640
134720822 3221224560 3212834796 1080776422 0 0 4098 17453 3222590806 0
0 33 0\n0 0 0"..., buf_size=4096, name=0xc3 <Address 0xc3 out of bounds>,
pid=10233) at process/common.c:98
#11 0x0805de2b in get_process_info_sysdep (p=0x8083f30) at
process/sysdep_LINUX.c:134
#12 0x0805c780 in getdatafromproc (pid=10233, entry=0x80847e0) at
process/common.c:185
#13 0x0805e047 in initprocesstree_sysdep (reference=0xbffff988) at
process/sysdep_LINUX.c:265
#14 0x0804e985 in initprocesstree (reference=0xbffff988) at
monit_process.c:178
#15 0x08054937 in validate () at validate.c:136
#16 0x0804f295 in do_default () at monitor.c:562
#17 0x0804f0c3 in do_action (args=0xfff) at monitor.c:348
#18 0x0804eb33 in main (argc=3, argv=0xbffffa24) at monitor.c:115
(gdb) thread 2
[Switching to thread 2 (Thread 32769 (LWP 24776))]#0 0x402896e6 in
poll () from /lib/libc.so.6
(gdb) bt
#0 0x402896e6 in poll () from /lib/libc.so.6
#1 0x400278fe in __pthread_manager () from /lib/libpthread.so.0
#2 0x40291be7 in clone () from /lib/libc.so.6
(gdb) thread 3
[Switching to thread 3 (Thread 16386 (LWP 24777))]#0 0x402896e6 in
poll () from /lib/libc.so.6
(gdb) bt
#0 0x402896e6 in poll () from /lib/libc.so.6
#1 0x0805ae7a in socket_producer (server=7) at http/engine.c:473
#2 0x0805aa8b in start_httpd (port=0, backlog=10, bindAddr=0x807f988
"127.0.0.1") at http/engine.c:194
#3 0x0804e86d in thread_wrapper (arg=0x0) at monit_http.c:167
#4 0x40027bf0 in pthread_start_thread () from /lib/libpthread.so.0
#5 0x40291be7 in clone () from /lib/libc.so.6
(gdb) print stopped
$1 = 0
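A minimal sketch of one way this kind of hang is usually avoided,
assuming the problem is that pthread_join() runs inside the signal
handler (pthread_join() is not async-signal-safe). This is not monit
code: stop_requested, on_sigterm, worker and run_shutdown_demo are
hypothetical names, and the worker only stands in for the HTTP thread.
The handler just sets a flag, and the join is done from the normal
control flow:

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>

/* Hypothetical flag: the only state the signal handler touches. */
static atomic_int stop_requested;

/* Async-signal-safe handler: record the request and return at once. */
static void on_sigterm(int sig)
{
    (void)sig;
    atomic_store(&stop_requested, 1);
}

/* Stand-in for the HTTP thread: poll with a timeout, then re-check. */
static void *worker(void *arg)
{
    int *stopped = arg;

    while (!atomic_load(&stop_requested))
        poll(NULL, 0, 10);          /* no fds, waits up to 10 ms */
    *stopped = 1;
    return NULL;
}

/* Returns 0 if the worker was joined cleanly outside the handler. */
int run_shutdown_demo(void)
{
    struct sigaction sa, old;
    pthread_t tid;
    int stopped = 0;

    atomic_store(&stop_requested, 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTERM, &sa, &old) != 0)
        return -1;
    if (pthread_create(&tid, NULL, worker, &stopped) != 0) {
        sigaction(SIGTERM, &old, NULL);
        return -1;
    }

    raise(SIGTERM);                 /* simulates "monit quit" */

    /* Main loop: the join happens here, not in the signal handler. */
    while (!atomic_load(&stop_requested))
        poll(NULL, 0, 10);
    pthread_join(tid, NULL);

    sigaction(SIGTERM, &old, NULL); /* restore previous disposition */
    return stopped ? 0 : 1;
}
```

Calling run_shutdown_demo() from a small main() should return 0 once
the worker has been joined. Whether the same restructuring fits
do_destroy() and monit_http() is something I have not verified.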
Martin