monit-general

Re: Monit exec without alert


From: Martin Pala
Subject: Re: Monit exec without alert
Date: Sun, 5 Feb 2012 09:24:20 +0100

Hi Callum,

The event filter you are using is the correct way to suppress particular 
alerts. However, the filter should be set to match the test rule that 
generates the event, in this case "resource":

   alert address@hidden but not on { resource }

The "exec" event is generated when program execution fails (for example, 
when the start or stop program was unable to start the process).
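
As a rough sketch of where filtering on "exec" would make sense (the service 
name, pidfile and init script paths below are made up for illustration), 
consider a process check whose start or stop program may fail to run and 
where you do not want mail about that failure:

--8<--
# hypothetical service; names and paths are assumptions, not from your setup
check process myapp with pidfile /var/run/myapp.pid
       start program = "/etc/init.d/myapp start"
       stop program = "/etc/init.d/myapp stop"
       alert address@hidden but not on { exec }
--8<--

The filter here only drops the execution failure event; events from other 
tests on the same service would still be mailed.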

The alert setting applies to the whole service, so if you want to receive an 
alert when the space usage exceeds some hard limit, you need to use two 
service entries, for example:

--8<--
check device cachefs_cleanup with path /foo/cache
       alert address@hidden but not on { resource }
       if space usage > 90% then exec "/bar/reduce_cache.sh" as uid apache and gid apache
       if inode usage > 80% then exec "/bar/reduce_cache.sh" as uid apache and gid apache

check device cachefs_alarm with path /foo/cache
       if space usage > 97% for 5 cycles then alert
       if inode usage > 90% for 5 cycles then alert
--8<--

Regards,
Martin


On Feb 3, 2012, at 5:46 PM, Callum Macdonald wrote:

> I'm using monit to monitor our tmpfs filesystems which are used for
> caching purposes. I've written a script which deletes the oldest 10% of
> files, according to their atime.
> 
> I've configured monit to run that script with exec when the filesystem
> usage reaches 90%, like this:
> 
> check device cachefs with path /foo/cache
>        alert address@hidden but not on { exec }
>        if space usage > 90% then exec "/bar/reduce_cache.sh" as uid apache and gid apache
>        if inode usage > 80% then exec "/bar/reduce_cache.sh" as uid apache and gid apache
>        if space usage > 97% for 5 cycles then alert
>        if inode usage > 90% for 5 cycles then alert
> 
> This node is running monit 4.10.1.
> 
> Currently I get emails like this:
> * Resource limit matched Service cachefs
> * Resource limit passed Service cachefs
> 
> I get two emails every time, one when the exec is run, then a second
> when the test recovers. I want to disable both of those emails, for this
> filesystem check only.
> 
> I *do* want to get an email if the script fails and filesystem usage
> hits 97% or more.
> 
> Can I do something like:
>       if space usage > 90% then exec "/foo/reduce_cache.sh" not alert
> 
> I've read the docs but can't figure out the syntax. As I read the
> documentation, alerts can be enabled for the whole filesystem check or
> not at all, but not per test. Is that correct?
> 
> Do I need to create two different checks? Is there any danger in having
> two monit checks monitoring the same filesystem?
> 
> Alternatively, is using monit's exec the "wrong" approach to control
> cache usage? I figured it was simpler and possibly more reliable than a
> cron script which parses the output of `df` and calls the script when
> required.
> 
> Thanks in advance for any input.
> 
> Love & joy - Callum.
> 
> ==
> Callum Macdonald
> 
> French mobile: +33 7 8708 5410
> UK mobile: +44 7968 378 810
> Desk: +44 845 126 0875
> www.callum-macdonald.com