monit-general

Re: [monit] Fine Tunning a monit configuration


From: Martin Pala
Subject: Re: [monit] Fine Tunning a monit configuration
Date: Tue, 24 Mar 2009 23:46:12 +0100
User-agent: Thunderbird 2.0.0.21 (X11/20090318)

You can run monit in verbose mode and verify what happened:

  monit -vI
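
Alongside the verbose output, it may help to check by hand whether the pid recorded in the pidfile from the configuration below (/var/run/httpd.pid) belongs to a live process, since the "process is not running" alert comes from the pidfile-based check. The following is only a sketch: pid_alive is a hypothetical helper, not part of monit, and it assumes it is run as root (kill -0 can fail for processes owned by another user).

```shell
#!/bin/sh
# pid_alive: hypothetical helper, not part of monit.
# Returns 0 if the pid stored in the given pidfile is a live process.
pid_alive() {
    pidfile="$1"

    # A missing or unreadable pidfile counts as "not running".
    [ -r "$pidfile" ] || return 1

    # Take the first line and strip whitespace.
    pid=$(head -n 1 "$pidfile" | tr -d '[:space:]')

    # Reject empty or non-numeric content.
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac

    # Signal 0 sends nothing; it only tests whether the process can be signalled.
    kill -0 "$pid" 2>/dev/null
}

# Default path taken from the configuration in this thread.
if pid_alive "${1:-/var/run/httpd.pid}"; then
    echo "pidfile points at a live process"
else
    echo "no live process for this pidfile"
fi
```

If this reports no live process while Apache is clearly serving requests, the pidfile is probably stale or in a different location than the one monit is told to watch.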

The problem you describe could be a startup synchronization bug in monit, which is fixed in the upcoming monit-5.0.

You can get monit-5.0_beta7 here:
http://www.mmonit.com/monit/download/

To create a binary RPM from the source distribution, it should be sufficient to download it and run:

  rpmbuild -tb monit-5.0_beta7.tar.gz



Martin



Andres Tarallo wrote:
I have a bunch of CentOS 5.2 servers running Apache, and I've installed monit 4.9 (RPMs from the DAG repository). These servers are heavily loaded most of the day (1-minute load average over 20 for many hours a day). I keep getting the following messages in my mailbox:

** Subject httpd Timeout - httpd unmonitor on XXXXX: 'httpd' service timed out and will not be checked anymore.
** Subject httpd Connection failed - httpd restart on XXXXXX: 'httpd' failed protocol test [HTTP] at INET[WW.WW.WWW.ZZZ:80] via TCP.
** httpd Does not exist - httpd restart on XXXXXX: 'httpd' process is not running.

The last one really puzzles me, because Apache is actually running!

My monit configuration file

set daemon  180
set logfile syslog facility log_daemon
set mailserver mail.company.net
set mail-format { from: address@hidden
subject: $SERVICE $EVENT
message: $SERVICE $ACTION on $HOST: $DESCRIPTION.}
set httpd port 2812 and
     use address localhost  # only accept connection from localhost
     allow localhost        # allow localhost to connect to the server and
check system XXXXXX
    if loadavg (1min) > 50 then alert
    if loadavg (5min) > 75 then alert
    if memory usage > 90% then alert
    if cpu usage (user) > 99% then alert
    if cpu usage (system) > 99% then alert
    if cpu usage (wait) > 99% then alert
    alert address@hidden

check file apache_bin with path /usr/sbin/httpd
   group apache
   if failed checksum then unmonitor
   if failed permission 755 then unmonitor
   if failed uid root then unmonitor
   if failed gid root then unmonitor

check process httpd with pidfile /var/run/httpd.pid
    start program = "/etc/init.d/httpd start"  as uid 0 as gid 0
    stop program  = "/etc/init.d/httpd stop" as uid 0 as gid 0
    if cpu > 99% for 5 cycles then restart
    if loadavg(5min) greater than 45 for 3 cycles then restart
    if failed host WW.WW.WWW.ZZZ port 80 protocol HTTP request "/site/page.php" timeout 15 seconds 10 cycles then restart
    if 6 restarts within 10 cycles then timeout
    alert address@hidden
    depends on apache_bin
    group apache

I'm pulling my hair out; it doesn't work flawlessly. I receive many alerts, even when the servers are working. Thanks





