monit-general


From: Martin Pala
Subject: Re: [monit] monit fails mysteriously on refresh after upgrade from Debian etch to lenny
Date: Tue, 24 Feb 2009 21:04:54 +0100
User-agent: Thunderbird 2.0.0.19 (X11/20090105)

The configuration looks good. The problem is very strange: Monit cannot die this way unless it either crashes or receives a signal to stop.

Could you please run monit with the "-vI" options? The -v option enables verbose output, as you already did, and -I keeps monit running in the foreground.


1.) stop monit

2.) enable coredumps:
ulimit -c unlimited

3.) start monit verbose/foreground:
monit -vI


... you will see the same verbose output, but in the terminal. If monit crashes or receives a signal to stop, you will get a message about it (as well as about any other error).
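
The three steps above can be sketched as one small shell function. This is only a sketch under assumptions: the init script path /etc/init.d/monit is assumed from the Debian setup discussed in this thread, and DRY_RUN, run and debug_monit are illustrative names that are not part of Monit. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the three debugging steps. DRY_RUN, run() and debug_monit()
# are hypothetical helpers; /etc/init.d/monit is an assumed path.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

debug_monit() {
    run /etc/init.d/monit stop   # 1.) stop the running monit daemon
    run ulimit -c unlimited      # 2.) allow core dumps in this shell
    run monit -vI                # 3.) verbose output, stay in foreground
}
```

Calling debug_monit with DRY_RUN left at 1 just lists the commands; with DRY_RUN=0 it runs them in the current shell, so the ulimit setting applies to the monit process started in step 3.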


Thanks,
Martin


Jenny Hopkins wrote:
2009/2/23 Jan-Henrik Haukeland <address@hidden>:
This doesn't sound like a Monit crash; it just runs one cycle and then dies. The
obvious explanation is that Monit was not started in daemon mode and that
this is a configuration issue.

So can you verify that you have a "set daemon x" line in the monitrc file, where
the x represents a number?  Also that you use the recipe for running Monit
from init described here, http://mmonit.com/wiki/Monit/FAQ#init
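
A quick way to verify the "set daemon x" line could look like the sketch below. has_daemon_line is a hypothetical helper name, not a Monit command, and the pattern assumes the statement sits on a single, uncommented line.

```shell
#!/bin/sh
# Sketch: succeed if the given control file has an uncommented
# "set daemon <number>" line. has_daemon_line is a hypothetical helper.
has_daemon_line() {
    # Leading whitespace is allowed; lines starting with '#' do not match.
    grep -Eq '^[[:space:]]*set[[:space:]]+daemon[[:space:]]+[0-9]+' "$1"
}
```

For example, `has_daemon_line /etc/monit/monitrc && echo "daemon mode configured"` prints the message only when such a line is present.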



Yes, I do have the set daemon x line in the config - I've tried x as
180, 120, 20...

I've tried running monit from /etc/init.d and also from a command line.

If yes, could you please provide us with the output from Monit's log file
when you run Monit in debug mode (the -v switch to the program) and your
monitrc file?

Below is the debug output I see on screen when I run the command below;
it shows the contents of the monitrc file with comments on it.
Let me know if the monitrc file itself would be helpful too, but this
output seems to contain the same information and more.  Below that is
the not very helpful log file.

Sorry for the long post.

Many thanks,

Jenny

/usr/sbin/monit  -v -c /etc/monit/monitrc -s /var/lib/monit/monit.state
monit: Debug: Adding credentials for user 'foo'.
Runtime constants:
 Control file       = /etc/monit/monitrc
 Log file           = /var/log/monit/monit.log
 Pid file           = /var/run/monit.pid
 Debug              = True
 Log                = True
 Use syslog         = False
 Is Daemon          = True
 Use process engine = True
 Poll time          = 10 seconds
 Mail server(s)     = localhost:25, courthouse.foo.co.uk:25
 Mail from          = address@hidden
 Mail subject       = monit alert --  $EVENT $SERVICE
 Mail message       = $EVENT Service $SERV..(truncated)
 Start monit httpd  = True
 httpd bind address = Any/All
 httpd portnumber   = 2812
 httpd signature    = True
 Use ssl encryption = False
 httpd auth. style  = Basic Authentication
 Alert mail to      = address@hidden
   Alert on         = All events

The service list contains the following entries:

Process Name          = apache
 Group                = server
 Pid file             = /var/run/apache2.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/apache2 start' timeout 1 cycle(s)
 Stop program         = '/etc/init.d/apache2 stop' timeout 1 cycle(s)
 Pid                  = if changed 1 times within 1 cycle(s) then alert
 Ppid                 = if changed 1 times within 1 cycle(s) then alert
 Port                 = if failed www.foo.co.uk:80/ [HTTP via TCP]
with timeout 30 seconds 1 times within 1 cycle(s) then restart else if
passed 1 times within 1 cycle(s) then alert
 Port                 = if failed www.foo.co.uk:80/monit/test [HTTP
via TCP] with timeout 5 seconds 1 times within 1 cycle(s) then alert
else if passed 1 times within 1 cycle(s) then alert
 Load avg. (5min)     = if greater than 10.0 8 times within 8 cycle(s)
then alert else if passed 1 times within 1 cycle(s) then alert
 CPU usage limit      = if greater than 60.0% 2 times within 2
cycle(s) then alert else if passed 1 times within 1 cycle(s) then
alert
 Timeout              = If 3 restart within 5 cycles then unmonitor
else if passed then alert

Process Name          = munin-node
 Pid file             = /var/run/munin/munin-node.pid
 Monitoring mode      = active
 Pid                  = if changed 1 times within 1 cycle(s) then alert
 Ppid                 = if changed 1 times within 1 cycle(s) then alert
 Port                 = if failed localhost:4949 [DEFAULT via TCP]
with timeout 5 seconds 1 times within 1 cycle(s) then restart else if
passed 1 times within 1 cycle(s) then alert

Process Name          = sshd
 Pid file             = /var/run/sshd.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/ssh start' timeout 1 cycle(s)
 Stop program         = '/etc/init.d/ssh stop' timeout 1 cycle(s)
 Pid                  = if changed 1 times within 1 cycle(s) then alert
 Ppid                 = if changed 1 times within 1 cycle(s) then alert
 Port                 = if failed localhost:22 [SSH via TCP] with
timeout 5 seconds 1 times within 1 cycle(s) then restart else if
passed 1 times within 1 cycle(s) then alert
 Timeout              = If 5 restart within 5 cycles then unmonitor
else if passed then alert

Process Name          = mysql
 Group                = database
 Pid file             = /var/run/mysqld/mysqld.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/mysql start' timeout 1 cycle(s)
 Stop program         = '/etc/init.d/mysql stop' timeout 1 cycle(s)
 Pid                  = if changed 1 times within 1 cycle(s) then alert
 Ppid                 = if changed 1 times within 1 cycle(s) then alert
 Port                 = if failed 127.0.0.1:3306 [DEFAULT via TCP]
with timeout 5 seconds 1 times within 1 cycle(s) then restart else if
passed 1 times within 1 cycle(s) then alert
 Timeout              = If 5 restart within 5 cycles then unmonitor
else if passed then alert

Process Name          = clamavd
 Group                = virus
 Pid file             = /var/run/clamav/clamd.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/clamav-daemon start' timeout 1 cycle(s)
 Stop program         = '/etc/init.d/clamav-daemon stop' timeout 1 cycle(s)
 Pid                  = if changed 1 times within 1 cycle(s) then alert
 Ppid                 = if changed 1 times within 1 cycle(s) then alert
 Unix Socket          = if failed /var/run/clamav/clamd.ctl [protocol
DEFAULT] with timeout 5 seconds 1 times within 1 cycle(s) then restart
else if passed 1 times within 1 cycle(s) then alert
 Timeout              = If 5 restart within 5 cycles then unmonitor
else if passed then alert

Process Name          = exim4
 Group                = mail
 Pid file             = /var/run/exim4/exim.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/exim4 start' timeout 1 cycle(s)
 Stop program         = '/etc/init.d/exim4 stop' timeout 1 cycle(s)
 Pid                  = if changed 1 times within 1 cycle(s) then alert
 Ppid                 = if changed 1 times within 1 cycle(s) then alert
 Timeout              = If 5 restart within 5 cycles then unmonitor
else if passed then alert

Process Name          = cron
 Group                = system
 Pid file             = /var/run/crond.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/cron start' timeout 1 cycle(s)
 Stop program         = '/etc/init.d/cron stop' timeout 1 cycle(s)
 Depends on Service   = cron_bin
 Depends on Service   = cron_init
 Pid                  = if changed 1 times within 1 cycle(s) then alert
 Ppid                 = if changed 1 times within 1 cycle(s) then alert

File Name             = cron_init
 Group                = system
 Path                 = /etc/init.d/cron
 Monitoring mode      = active

File Name             = cron_bin
 Group                = system
 Path                 = /usr/sbin/cron
 Monitoring mode      = active

Process Name          = freshclam
 Group                = virus
 Pid file             = /var/run/clamav/freshclam.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/clamav-freshclam start' timeout 1 cycle(s)
 Stop program         = '/etc/init.d/clamav-freshclam stop' timeout 1 cycle(s)
 Depends on Service   = freshclam_bin
 Depends on Service   = freshclam_init
 Pid                  = if changed 1 times within 1 cycle(s) then alert
 Ppid                 = if changed 1 times within 1 cycle(s) then alert

File Name             = freshclam_init
 Group                = virus
 Path                 = /etc/init.d/clamav-freshclam
 Monitoring mode      = active

File Name             = freshclam_bin
 Group                = virus
 Path                 = /usr/bin/freshclam
 Monitoring mode      = active

Process Name          = spamd
 Group                = mail
 Pid file             = /var/run/spamd/spamd.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/spamassassin start' timeout 1 cycle(s)
 Stop program         = '/etc/init.d/spamassassin stop' timeout 1 cycle(s)
 Pid                  = if changed 1 times within 1 cycle(s) then alert
 Ppid                 = if changed 1 times within 1 cycle(s) then alert
 Memory usage limit   = if greater than 99.0% 5 times within 5
cycle(s) then alert else if passed 1 times within 1 cycle(s) then
alert
 CPU usage limit      = if greater than 99.0% 5 times within 5
cycle(s) then alert else if passed 1 times within 1 cycle(s) then
alert
 Timeout              = If 5 restart within 5 cycles then unmonitor
else if passed then alert

Process Name          = dovecot
 Group                = mail
 Pid file             = /var/run/dovecot/master.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/dovecot start' timeout 1 cycle(s)
 Stop program         = '/etc/init.d/dovecot stop' timeout 1 cycle(s)
 Depends on Service   = dovecot_bin
 Depends on Service   = dovecot_init
 Pid                  = if changed 1 times within 1 cycle(s) then alert
 Ppid                 = if changed 1 times within 1 cycle(s) then alert
 Port                 = if failed localhost:143 [IMAP via TCP] with
timeout 5 seconds 1 times within 1 cycle(s) then restart else if
passed 1 times within 1 cycle(s) then alert
 Timeout              = If 5 restart within 5 cycles then unmonitor
else if passed then alert

File Name             = dovecot_init
 Group                = mail
 Path                 = /etc/init.d/dovecot
 Monitoring mode      = active

File Name             = dovecot_bin
 Group                = mail
 Path                 = /usr/sbin/dovecot
 Monitoring mode      = active

Process Name          = syslog
 Group                = service
 Pid file             = /var/run/syslogd.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/sysklogd start' timeout 1 cycle(s)
 Stop program         = '/etc/init.d/sysklogd stop' timeout 1 cycle(s)
 Depends on Service   = syslogd_logfile
 Pid                  = if changed 1 times within 1 cycle(s) then alert
 Ppid                 = if changed 1 times within 1 cycle(s) then alert
 Every                = Check service every 12 cycles
 Timeout              = If 3 restart within 5 cycles then unmonitor
else if passed then alert

File Name             = syslogd_logfile
 Group                = service
 Path                 = /var/log/syslog
 Monitoring mode      = active
 Timestamp            = if greater than 3900 second(s) 1 times within
1 cycle(s) then alert else if passed 1 times within 1 cycle(s) then
alert
 Every                = Check service every 48 cycles

Process Name          = xinetd
 Group                = network
 Pid file             = /var/run/xinetd.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/xinetd start' timeout 1 cycle(s)
 Stop program         = '/etc/init.d/xinetd stop' timeout 1 cycle(s)
 Depends on Service   = xinetd_bin
 Depends on Service   = xinetd_init
 Pid                  = if changed 1 times within 1 cycle(s) then alert
 Ppid                 = if changed 1 times within 1 cycle(s) then alert

File Name             = xinetd_init
 Group                = network
 Path                 = /etc/init.d/xinetd
 Monitoring mode      = active

File Name             = xinetd_bin
 Group                = network
 Path                 = /usr/sbin/xinetd
 Monitoring mode      = active

System Name           = stoneboat.foo.co.uk
 Monitoring mode      = active

-------------------------------------------------------------------------------
Starting monit daemon with http interface at [*:2812]



 more monit.log

[GMT Feb 23 21:41:57] info     : Starting monit daemon with http
interface at [*:2812]
[GMT Feb 23 21:41:57] info     : Starting monit HTTP server at [*:2812]
[GMT Feb 23 21:41:57] info     : monit HTTP server started
[GMT Feb 23 21:41:57] info     : Monit started
[GMT Feb 23 21:41:57] debug    : Monit instance changed notification
is sent to address@hidden
[GMT Feb 23 21:41:58] debug    : 'apache' zombie check passed [status_flag=0000]
[GMT Feb 23 21:41:58] debug    : 'apache' loadavg(5min) check passed
[current loadavg(5min)=1.1]
[GMT Feb 23 21:41:58] debug    : 'apache' cpu usage check passed
[current cpu usage=0.0%]
[GMT Feb 23 21:41:58] debug    : 'apache' succeeded connecting to
INET[www.foo.co.uk:80] via TCP
[GMT Feb 23 21:41:58] debug    : 'apache' succeeded testing protocol
[HTTP] at INET[www.foo.co.uk:80] via TCP
[GMT Feb 23 21:41:58] debug    : 'apache' succeeded connecting to
INET[www.foo.co.uk:80] via TCP
[GMT Feb 23 21:41:58] debug    : 'apache' succeeded testing protocol
[HTTP] at INET[www.foo.co.uk:80] via TCP
[GMT Feb 23 21:41:58] debug    : 'munin-node' zombie check passed
[status_flag=0000]
[GMT Feb 23 21:41:58] debug    : 'munin-node' succeeded connecting to
INET[localhost:4949] via TCP
[GMT Feb 23 21:41:58] debug    : 'munin-node' succeeded testing
protocol [DEFAULT] at INET[localhost:4949] via TCP
[GMT Feb 23 21:41:58] debug    : 'sshd' zombie check passed [status_flag=0000]
[GMT Feb 23 21:41:58] debug    : 'sshd' succeeded connecting to
INET[localhost:22] via TCP
[GMT Feb 23 21:41:58] debug    : 'sshd' succeeded testing protocol
[SSH] at INET[localhost:22] via TCP
[GMT Feb 23 21:41:58] debug    : 'mysql' zombie check passed [status_flag=0000]
[GMT Feb 23 21:41:58] debug    : 'mysql' succeeded connecting to
INET[127.0.0.1:3306] via TCP
[GMT Feb 23 21:41:58] debug    : 'mysql' succeeded testing protocol
[DEFAULT] at INET[127.0.0.1:3306] via TCP
[GMT Feb 23 21:41:58] debug    : 'clamavd' zombie check passed
[status_flag=0000]
[GMT Feb 23 21:41:58] debug    : 'clamavd' succeeded connecting to
UNIX[/var/run/clamav/clamd.ctl]
[GMT Feb 23 21:41:58] debug    : 'clamavd' succeeded testing protocol
[DEFAULT] at UNIX[/var/run/clamav/clamd.ctl]
[GMT Feb 23 21:41:58] debug    : 'exim4' zombie check passed [status_flag=0000]
[GMT Feb 23 21:41:58] debug    : 'cron_init' file existence check passed
[GMT Feb 23 21:41:58] debug    : 'cron_init' is regular file
[GMT Feb 23 21:41:58] debug    : 'cron_bin' file existence check passed
[GMT Feb 23 21:41:58] debug    : 'cron_bin' is regular file
[GMT Feb 23 21:41:58] debug    : 'freshclam_init' file existence check passed
[GMT Feb 23 21:41:58] debug    : 'freshclam_init' is regular file
[GMT Feb 23 21:41:58] debug    : 'freshclam_bin' file existence check passed
[GMT Feb 23 21:41:58] debug    : 'freshclam_bin' is regular file
[GMT Feb 23 21:41:58] debug    : 'spamd' zombie check passed [status_flag=0000]
[GMT Feb 23 21:41:58] debug    : 'spamd' mem usage check passed
[current mem usage=1.3%]
[GMT Feb 23 21:41:58] debug    : 'spamd' cpu usage check passed
[current cpu usage=0.0%]
[GMT Feb 23 21:41:58] debug    : 'dovecot_init' file existence check passed
[GMT Feb 23 21:41:58] debug    : 'dovecot_init' is regular file
[GMT Feb 23 21:41:58] debug    : 'dovecot_bin' file existence check passed
[GMT Feb 23 21:41:58] debug    : 'dovecot_bin' is regular file
[GMT Feb 23 21:41:58] debug    : 'xinetd_init' file existence check passed
[GMT Feb 23 21:41:58] debug    : 'xinetd_init' is regular file
[GMT Feb 23 21:41:58] debug    : 'xinetd_bin' file existence check passed
[GMT Feb 23 21:41:58] debug    : 'xinetd_bin' is regular file
[GMT Feb 23 21:41:58] debug    : 'cron' zombie check passed [status_flag=0000]
[GMT Feb 23 21:41:58] debug    : 'freshclam' zombie check passed
[status_flag=0000]
[GMT Feb 23 21:41:58] debug    : 'dovecot' zombie check passed
[status_flag=0000]
[GMT Feb 23 21:41:58] debug    : 'dovecot' succeeded connecting to
INET[localhost:143] via TCP
[GMT Feb 23 21:41:58] debug    : 'dovecot' succeeded testing protocol
[IMAP] at INET[localhost:143] via TCP
[GMT Feb 23 21:41:58] debug    : 'xinetd' zombie check passed [status_flag=0000]
[GMT Feb 23 21:42:09] info     : Monit has not changed
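
Since most of the log above is debug output, a filter that hides those lines may make the remaining info and error lines easier to spot. This is a sketch only: non_debug_lines is a hypothetical helper name, and the line layout "[timestamp] level : message" is assumed from the excerpt above.

```shell
#!/bin/sh
# Sketch: print every log line that is not at debug level.
# non_debug_lines is a hypothetical helper; the "[...] debug :" layout
# is assumed from the log excerpt in this message.
non_debug_lines() {
    grep -Ev '^\[[^]]*\][[:space:]]+debug[[:space:]]*:' "$1"
}
```

For example, `non_debug_lines /var/log/monit/monit.log` would list only the start-up messages and the final "Monit has not changed" line from a log like this one.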

