I have a bunch of CentOS 5.2 servers running Apache, and I've installed monit 4.9 (RPMs from the DAG repository). These servers are heavily loaded most of the day (1-minute load average over 20 for many hours a day). I keep getting the following messages in my mailbox:
** Subject httpd Timeout - httpd unmonitor on XXXXX: 'httpd' service timed out and will not be checked anymore.
** Subject httpd Connection failed - httpd restart on XXXXXX: 'httpd' failed protocol test [HTTP] at INET[WW.WW.WWW.ZZZ:80] via TCP.
** httpd Does not exist - httpd restart on XXXXXX: 'httpd' process is not running.
The last one really puzzles me, because Apache is actually running!
My monit configuration file:
set daemon 180
set logfile syslog facility log_daemon
set mailserver mail.company.net
set mail-format { from: address@hidden
    subject: $SERVICE $EVENT
    message: $SERVICE $ACTION on $HOST: $DESCRIPTION.}
set httpd port 2812 and
    use address localhost # only accept connection from localhost
    allow localhost # allow localhost to connect to the server and
check system XXXXXX
    if loadavg (1min) > 50 then alert
    if loadavg (5min) > 75 then alert
    if memory usage > 90% then alert
    if cpu usage (user) > 99% then alert
    if cpu usage (system) > 99% then alert
    if cpu usage (wait) > 99% then alert
    alert address@hidden
check file apache_bin with path /usr/sbin/httpd
    group apache
    if failed checksum then unmonitor
    if failed permission 755 then unmonitor
    if failed uid root then unmonitor
    if failed gid root then unmonitor
check process httpd with pidfile /var/run/httpd.pid
    start program = "/etc/init.d/httpd start" as uid 0 as gid 0
    stop program = "/etc/init.d/httpd stop" as uid 0 as gid 0
    if cpu > 99% for 5 cycles then restart
    if loadavg(5min) greater than 45 for 3 cycles then restart
    if failed host WW.WW.WWW.ZZZ port 80 protocol HTTP request "/site/page.php" timeout 15 seconds 10 cycles then restart
    if 6 restarts within 10 cycles then timeout
    alert address@hidden
    depends on apache_bin
    group apache
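For the "process is not running" alert, one thing that could be checked by hand is whether the PID stored in the pidfile above really belongs to a live process. This is only a rough sketch, on the assumption that monit decides "running" from that pidfile; pid_alive is a hypothetical helper name, not something monit provides, and it is best run as root because kill -0 also fails on processes owned by another user.

```shell
#!/bin/sh
# Sketch: does the PID recorded in a pidfile refer to a live process?
# pid_alive is a hypothetical helper, not part of monit.
# Returns 0 if alive, 1 if not alive (or not signalable), 2 if the pidfile is unusable.
pid_alive() {
    pidfile="$1"
    # Missing or unreadable pidfile: there is no PID to look up.
    [ -r "$pidfile" ] || return 2
    # Take the first line and drop any whitespace around the number.
    pid=$(head -n 1 "$pidfile" | tr -d '[:space:]')
    # Empty or non-numeric content counts as a broken pidfile.
    case "$pid" in
        ''|*[!0-9]*) return 2 ;;
    esac
    # kill -0 delivers no signal; it only tests that the process can be signalled.
    kill -0 "$pid" 2>/dev/null
}
```

Calling `pid_alive /var/run/httpd.pid; echo $?` while Apache is up should print 0; a 1 or 2 would point at a stale, missing, or differently located pidfile rather than at Apache itself.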
I'm pulling my hair out. It doesn't work flawlessly: I receive many alerts, even when the servers are working. Thanks.