monit-general
Re: [monit] can't monitor one of my filesystems


From: Martin Pala
Subject: Re: [monit] can't monitor one of my filesystems
Date: Wed, 12 May 2010 00:37:04 +0200

Thanks for the output.

The likely cause is that the device is not mounted: it was not found in 
/etc/mtab. The statvfs() interface, which is used to get filesystem usage, 
needs a path to an object that resides on the filesystem being tested. Hence, 
when a device name is used, monit translates it to the mount point using 
/etc/mtab. There is currently no fine-grained error message for this state; 
it is caught by the test itself, which logs the general "unable to read 
filesystem /dev/sda2 state".

Can you please check that /dev/sda2 is mounted and that it can be found in 
/etc/mtab?
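
As a rough illustration of the lookup described above (not monit's actual 
code), the Python sketch below maps a device name to a mount point from 
mtab-style text and only then calls statvfs() on that path. The function 
names, the sample mtab contents and the usage formula are assumptions made 
for this example.

```python
import os


def mountpoint_for(device, mtab_text):
    """Return the mount point listed for `device` in mtab-style text, or None."""
    for line in mtab_text.splitlines():
        fields = line.split()
        # mtab columns: device, mount point, fs type, options, dump, pass
        if len(fields) >= 2 and fields[0] == device:
            return fields[1]
    return None


def space_usage_percent(path):
    """Percentage of blocks in use on the filesystem that contains `path`.

    One common way to derive a percentage from statvfs(); monit's exact
    formula is not shown in this thread.
    """
    st = os.statvfs(path)
    if st.f_blocks == 0:
        return 0.0
    return 100.0 * (st.f_blocks - st.f_bfree) / st.f_blocks


# Hypothetical mtab contents, for illustration only.
SAMPLE_MTAB = """\
/dev/sda1 /boot ext3 rw 0 0
/dev/sdb1 /data ext3 rw 0 0
"""

if __name__ == "__main__":
    for dev in ("/dev/sda1", "/dev/sda2"):
        mp = mountpoint_for(dev, SAMPLE_MTAB)
        if mp is None:
            # No mount point means there is no path to hand to statvfs().
            print(f"{dev}: not found in mtab, cannot call statvfs()")
        else:
            print(f"{dev}: mounted at {mp}")
```

With a device that has no mtab entry, the lookup returns nothing and there 
is no path to pass to statvfs(), which matches the failure mode described 
above. On a real system the text would come from reading /etc/mtab.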


On May 6, 2010, at 9:33 PM, zachlac wrote:

> 
> Here's the output of monit -vI.  I do not believe that it's a virtual
> machine.
> 
> [EDT May  6 15:28:57] debug    : monit: pidfile '/var/run/monit.pid' does
> not exist
> [EDT May  6 15:28:57] info     : Starting monit daemon with http interface
> at [www.***************.com:2812]
> [EDT May  6 15:28:57] info     : Starting monit HTTP server at
> [www.***************.com:2812]
> [EDT May  6 15:28:57] info     : monit HTTP server started
> [EDT May  6 15:28:57] info     : 'www' Monit started
> [EDT May  6 15:28:57] debug    : Monit instance changed notification is sent
> to address@hidden
> [EDT May  6 15:28:57] debug    : cannot open file /proc/32077/stat -- No
> such file or directory
> [EDT May  6 15:28:57] debug    : system statistic error -- cannot read
> /proc/32077/stat
> [EDT May  6 15:28:57] debug    : 'www' cpu wait usage check succeeded
> [current cpu wait usage=-1.0%]
> [EDT May  6 15:28:57] debug    : 'www' cpu system usage check succeeded
> [current cpu system usage=-1.0%]
> [EDT May  6 15:28:57] debug    : 'www' cpu user usage check succeeded
> [current cpu user usage=-1.0%]
> [EDT May  6 15:28:57] debug    : 'www' swap usage check succeeded [current
> swap usage=0.0%]
> [EDT May  6 15:28:57] debug    : 'www' mem usage check succeeded [current
> mem usage=34.8%]
> [EDT May  6 15:28:57] debug    : 'www' loadavg(5min) check succeeded
> [current loadavg(5min)=0.1]
> [EDT May  6 15:28:57] debug    : 'www' loadavg(1min) check succeeded
> [current loadavg(1min)=0.0]
> [EDT May  6 15:28:57] debug    : 'apache_bin' file existence check succeeded
> [EDT May  6 15:28:57] debug    : 'apache_bin' is a regular file
> [EDT May  6 15:28:57] debug    : 'apache_bin' has valid checksums
> [EDT May  6 15:28:57] debug    : 'apache_bin' permission check succeeded
> [current permission=0755]
> [EDT May  6 15:28:57] debug    : 'apache_bin' uid check succeeded [current
> uid=0]
> [EDT May  6 15:28:57] debug    : 'apache_bin' gid check succeeded [current
> gid=0]
> [EDT May  6 15:28:57] debug    : 'apache_rc' file existence check succeeded
> [EDT May  6 15:28:57] debug    : 'apache_rc' is a regular file
> [EDT May  6 15:28:57] debug    : 'apache_rc' has valid checksums
> [EDT May  6 15:28:57] debug    : 'apache_rc' permission check succeeded
> [current permission=0755]
> [EDT May  6 15:28:57] debug    : 'apache_rc' uid check succeeded [current
> uid=0]
> [EDT May  6 15:28:57] debug    : 'apache_rc' gid check succeeded [current
> gid=0]
> [EDT May  6 15:28:57] debug    : 'sendmail_bin' file existence check
> succeeded
> [EDT May  6 15:28:57] debug    : 'sendmail_bin' is a regular file
> [EDT May  6 15:28:57] debug    : 'sendmail_bin' has valid checksums
> [EDT May  6 15:28:57] debug    : 'sendmail_bin' permission check succeeded
> [current permission=6755]
> [EDT May  6 15:28:57] debug    : 'sendmail_bin' uid check succeeded [current
> uid=0]
> [EDT May  6 15:28:57] debug    : 'sendmail_bin' gid check succeeded [current
> gid=51]
> [EDT May  6 15:28:57] debug    : 'sendmail_rc' file existence check
> succeeded
> [EDT May  6 15:28:57] debug    : 'sendmail_rc' is a regular file
> [EDT May  6 15:28:57] debug    : 'sendmail_rc' has valid checksums
> [EDT May  6 15:28:57] debug    : 'sendmail_rc' permission check succeeded
> [current permission=0755]
> [EDT May  6 15:28:57] debug    : 'sendmail_rc' uid check succeeded [current
> uid=0]
> [EDT May  6 15:28:57] debug    : 'sendmail_rc' gid check succeeded [current
> gid=0]
> [EDT May  6 15:28:57] debug    : 'dovecot_bin' file existence check
> succeeded
> [EDT May  6 15:28:57] debug    : 'dovecot_bin' is a regular file
> [EDT May  6 15:28:57] debug    : 'dovecot_bin' has valid checksums
> [EDT May  6 15:28:57] debug    : 'dovecot_bin' permission check succeeded
> [current permission=0755]
> [EDT May  6 15:28:57] debug    : 'dovecot_bin' uid check succeeded [current
> uid=0]
> [EDT May  6 15:28:57] debug    : 'dovecot_bin' gid check succeeded [current
> gid=0]
> [EDT May  6 15:28:57] debug    : 'dovecot_rc' file existence check succeeded
> [EDT May  6 15:28:57] debug    : 'dovecot_rc' is a regular file
> [EDT May  6 15:28:57] debug    : 'dovecot_rc' has valid checksums
> [EDT May  6 15:28:57] debug    : 'dovecot_rc' permission check succeeded
> [current permission=0755]
> [EDT May  6 15:28:57] debug    : 'dovecot_rc' uid check succeeded [current
> uid=0]
> [EDT May  6 15:28:57] debug    : 'dovecot_rc' gid check succeeded [current
> gid=0]
> [EDT May  6 15:28:57] debug    : 'ntpd_bin' file existence check succeeded
> [EDT May  6 15:28:57] debug    : 'ntpd_bin' is a regular file
> [EDT May  6 15:28:57] debug    : 'ntpd_bin' has valid checksums
> [EDT May  6 15:28:57] debug    : 'ntpd_bin' permission check succeeded
> [current permission=0755]
> [EDT May  6 15:28:57] debug    : 'ntpd_bin' uid check succeeded [current
> uid=0]
> [EDT May  6 15:28:57] debug    : 'ntpd_bin' gid check succeeded [current
> gid=0]
> [EDT May  6 15:28:57] debug    : 'ntpd_rc' file existence check succeeded
> [EDT May  6 15:28:57] debug    : 'ntpd_rc' is a regular file
> [EDT May  6 15:28:57] debug    : 'ntpd_rc' has valid checksums
> [EDT May  6 15:28:57] debug    : 'ntpd_rc' permission check succeeded
> [current permission=0755]
> [EDT May  6 15:28:57] debug    : 'ntpd_rc' uid check succeeded [current
> uid=0]
> [EDT May  6 15:28:57] debug    : 'ntpd_rc' gid check succeeded [current
> gid=0]
> [EDT May  6 15:28:57] debug    : 'sshd_bin' file existence check succeeded
> [EDT May  6 15:28:57] debug    : 'sshd_bin' is a regular file
> [EDT May  6 15:28:57] debug    : 'sshd_bin' has valid checksums
> [EDT May  6 15:28:57] debug    : 'sshd_bin' permission check succeeded
> [current permission=0755]
> [EDT May  6 15:28:57] debug    : 'sshd_bin' uid check succeeded [current
> uid=0]
> [EDT May  6 15:28:57] debug    : 'sshd_bin' gid check succeeded [current
> gid=0]
> [EDT May  6 15:28:57] debug    : 'sshd_rc' file existence check succeeded
> [EDT May  6 15:28:57] debug    : 'sshd_rc' is a regular file
> [EDT May  6 15:28:57] debug    : 'sshd_rc' has valid checksums
> [EDT May  6 15:28:57] debug    : 'sshd_rc' permission check succeeded
> [current permission=0755]
> [EDT May  6 15:28:57] debug    : 'sshd_rc' uid check succeeded [current
> uid=0]
> [EDT May  6 15:28:57] debug    : 'sshd_rc' gid check succeeded [current
> gid=0]
> [EDT May  6 15:28:57] debug    : 'apache' zombie check succeeded
> [status_flag=0000]
> [EDT May  6 15:28:57] debug    : 'apache' loadavg(5min) check succeeded
> [current loadavg(5min)=0.1]
> [EDT May  6 15:28:57] debug    : 'apache' children check succeeded [current
> children=13]
> [EDT May  6 15:28:57] debug    : 'apache' total mem amount check succeeded
> [current total mem amount=263024kB]
> [EDT May  6 15:28:57] debug    : 'apache' cpu usage check skipped
> (initializing)
> [EDT May  6 15:28:57] debug    : [EDT May  6 15:28:57] debug    : 'apache'
> succeeded connecting to INET[www.***************.com:80] via TCP
> [EDT May  6 15:28:57] debug    : 'apache' succeeded testing protocol [HTTP]
> at INET[www.***************.com:80] via TCP
> [EDT May  6 15:28:57] debug    : 'sendmail' zombie check succeeded
> [status_flag=0000]
> [EDT May  6 15:28:57] debug    : 'sendmail' succeeded connecting to
> INET[localhost:25] via TCP
> [EDT May  6 15:28:57] debug    : 'sendmail' succeeded testing protocol
> [SMTP] at INET[localhost:25] via TCP
> [EDT May  6 15:28:57] debug    : 'dovecot' zombie check succeeded
> [status_flag=0000]
> [EDT May  6 15:28:57] debug    : 'dovecot' succeeded connecting to
> INET[localhost:993] via TCPSSL
> [EDT May  6 15:28:57] debug    : 'dovecot' succeeded testing protocol [IMAP]
> at INET[localhost:993] via TCPSSL
> [EDT May  6 15:28:57] debug    : 'ntp' zombie check succeeded
> [status_flag=0000]
> [EDT May  6 15:28:57] debug    : 'ssh' zombie check succeeded
> [status_flag=0000]
> [EDT May  6 15:28:57] debug    : 'datafs_sdb1' permission check succeeded
> [current permission=0640]
> [EDT May  6 15:28:57] debug    : 'datafs_sdb1' uid check succeeded [current
> uid=0]
> [EDT May  6 15:28:57] debug    : 'datafs_sdb1' gid check succeeded [current
> gid=6]
> [EDT May  6 15:28:57] debug    : 'datafs_sdb1' inode usage check succeeded
> [current inode usage=1.5%]
> [EDT May  6 15:28:57] debug    : 'datafs_sdb1' inode usage check succeeded
> [current inode usage=1.5%]
> [EDT May  6 15:28:57] debug    : 'datafs_sdb1' space usage check succeeded
> [current space usage=69.6%]
> [EDT May  6 15:28:57] debug    : 'datafs_sdb1' space usage check succeeded
> [current space usage=69.6%]
> [EDT May  6 15:28:57] debug    : 'swap_sdb2' permission check succeeded
> [current permission=0640]
> [EDT May  6 15:28:57] debug    : 'swap_sdb2' uid check succeeded [current
> uid=0]
> [EDT May  6 15:28:57] debug    : 'swap_sdb2' gid check succeeded [current
> gid=6]
> [EDT May  6 15:28:57] debug    : 'swap_sdb2' inode usage check succeeded
> [current inode usage=0.1%]
> [EDT May  6 15:28:57] debug    : 'swap_sdb2' inode usage check succeeded
> [current inode usage=0.1%]
> [EDT May  6 15:28:57] debug    : 'swap_sdb2' space usage check succeeded
> [current space usage=14.6%]
> [EDT May  6 15:28:57] debug    : 'swap_sdb2' space usage check succeeded
> [current space usage=14.6%]
> [EDT May  6 15:28:57] debug    : 'boot_sda1' permission check succeeded
> [current permission=0640]
> [EDT May  6 15:28:57] debug    : 'boot_sda1' uid check succeeded [current
> uid=0]
> [EDT May  6 15:28:57] debug    : 'boot_sda1' gid check succeeded [current
> gid=6]
> [EDT May  6 15:28:57] debug    : 'boot_sda1' inode usage check succeeded
> [current inode usage=0.1%]
> [EDT May  6 15:28:57] debug    : 'boot_sda1' inode usage check succeeded
> [current inode usage=0.1%]
> [EDT May  6 15:28:57] debug    : 'boot_sda1' space usage check succeeded
> [current space usage=24.2%]
> [EDT May  6 15:28:57] debug    : 'boot_sda1' space usage check succeeded
> [current space usage=24.2%]
> [EDT May  6 15:28:57] error    : 'datafs_sda2' unable to read filesystem
> /dev/sda2 state
> [EDT May  6 15:28:57] debug    : Data access error notification is sent to
> address@hidden
> [EDT May  6 15:28:58] debug    : 'rootfs_logical' space usage check
> succeeded [current space usage=35.6%]
> [EDT May  6 15:28:58] debug    : ICMP echo response 1/3 succeeded --
> received id=38340 sequence=0 response_time=0.000171s
> [EDT May  6 15:28:58] debug    : 'shade' icmp ping succeeded [response time
> 0.000s]
> [EDT May  6 15:28:58] debug    : 'shade' succeeded connecting to
> INET[xxx.xxx.xxx.xxx:22] via TCP
> [EDT May  6 15:28:58] debug    : 'shade' succeeded testing protocol [SSH] at
> INET[xxx.xxx.xxx.xxx:22] via TCP
> [EDT May  6 15:29:07] debug    : HttpRequest error: HTTP/1.0 401 You are not
> authorized to access monit. Either you supplied the wrong credentials (e.g.
> bad password), or your browser doesn't understand how to supply the
> credentials required
> [EDT May  6 15:29:09] debug    : HttpRequest error: HTTP/1.0 404 There is no
> service by that name
> [EDT May  6 15:29:13] debug    : HttpRequest error: HTTP/1.0 404 There is no
> service by that name
> 
> 
> 
> Martin Pala wrote:
>> 
>> Is the system a virtual machine of some type (VPS, etc.) or a real/physical
>> machine? If it is virtual, it is possible that access is rejected because
>> of host OS restrictions. There can also be other access control
>> restrictions, for example if you use SELinux ... 
>> 
>> The svn repository contains the development version of 5.2 in various
>> development stages (some features may be incomplete), and the features
>> may not have been tested yet - the exact codebase depends on when you
>> updated the source code. The problems you have shouldn't be specific to
>> 5.2-development anyway, as there were no changes that could cause a problem
>> like this, but it would be good to verify the behavior with the official
>> 5.1.1 version.
>> 
>> Can you please also run monit with debug enabled and provide the full output?
>> 
>> monit -vI
>> 
>> 
>> 
>> 
>> On May 4, 2010, at 4:05 PM, zachlac wrote:
>> 
>>> 
>>> sda2 cannot be monitored, while sda1 can:
>>> 
>>> # ls -l /dev/sda2
>>> brw-r----- 1 root disk 8, 2 Feb 19 14:22 /dev/sda2
>>> # ls -l /dev/sda1
>>> brw-r----- 1 root disk 8, 2 Feb 19 14:22 /dev/sda1
>>> 
>>> I'm using the repository version of monit, which is 5.2.
>>> 
>>> Thank you.
>>> 
>>> 
>>> Martin Pala wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> LVM shouldn't be a problem. Can you please provide the output of "ls -l
>>>> /dev/sda2"? Which monit version do you use? There was a problem in monit
>>>> <= 4.10.1 when the device was a symlink - support for device symlinks was
>>>> added in Monit 5.0 (the current version is Monit 5.1.1).
>>>> 
>>>> Alternatively, you can use the mount point instead of the device.
>>>> 
>>>> Regards,
>>>> Martin
>>>> 
>>>> 
>>>> On May 3, 2010, at 6:43 PM, zachlac wrote:
>>>> 
>>>>> 
>>>>> I have monit monitoring /dev/sdb1, /dev/sdb2, and /dev/sda1.  However,
>>>>> /dev/sda2 is a Linux LVM, and when I try to monitor it I get a "Data
>>>>> access error".  My output for fdisk is as follows:
>>>>> ----------------------------------------------------------------------------------------------
>>>>> Disk /dev/sda: 200.0 GB, 200049647616 bytes
>>>>> 255 heads, 63 sectors/track, 24321 cylinders
>>>>> Units = cylinders of 16065 * 512 = 8225280 bytes
>>>>> 
>>>>> Device Boot      Start         End      Blocks   Id  System
>>>>> /dev/sda1   *           1          13      104391   83  Linux
>>>>> /dev/sda2              14       24321   195254010   8e  Linux LVM
>>>>> 
>>>>> Disk /dev/sdb: 200.0 GB, 200049647616 bytes
>>>>> 255 heads, 63 sectors/track, 24321 cylinders
>>>>> Units = cylinders of 16065 * 512 = 8225280 bytes
>>>>> 
>>>>> Device Boot      Start         End      Blocks   Id  System
>>>>> /dev/sdb1   *           1       12160    97675168+  83  Linux
>>>>> /dev/sdb2           12161       24321    97683232+  83  Linux
>>>>> --------------------------------------------------------------------------------------------
>>>>> 
>>>>> My monitrc contains the following important lines:
>>>>> ---------------------------------------------------------------------------------------------
>>>>> check filesystem boot_sda1 with path /dev/sda1
>>>>>  start program  = "/bin/mount /data"
>>>>>  stop program  = "/bin/umount /data"
>>>>>  if failed permission 640 then unmonitor
>>>>>  if failed uid root then unmonitor
>>>>>  if failed gid disk then unmonitor
>>>>>  if space usage > 80% for 5 times within 15 cycles then alert
>>>>>  if space usage > 99% then stop
>>>>> #    if inode usage > 30000 then alert
>>>>> #    if inode usage > 250000 then alert
>>>>>  if inode usage > 80% then alert
>>>>>  if inode usage > 99% then stop
>>>>>  group server
>>>>> 
>>>>> check filesystem datafs_sda2 with path /dev/sda2
>>>>>  start program  = "/bin/mount /data"
>>>>>  stop program  = "/bin/umount /data"
>>>>>  if failed permission 640 then unmonitor
>>>>>  if failed uid root then unmonitor
>>>>>  if failed gid disk then unmonitor
>>>>>  if space usage > 80% for 5 times within 15 cycles then alert
>>>>>  if space usage > 99% then stop
>>>>> #    if inode usage > 30000 then alert
>>>>> #    if inode usage > 250000 then alert
>>>>>  if inode usage > 80% then alert
>>>>>  if inode usage > 99% then stop
>>>>>  group server
>>>>> ---------------------------------------------------------------------------------------------
>>>>> 
>>>>> Why can't I monitor the LVM?
>>>>> 
>>>>> Thank you.
>>>>> -- 
>>>>> View this message in context:
>>>>> http://old.nabble.com/-monit--can%27t-monitor-one-of-my-filesystems-tp28437378p28437378.html
>>>>> Sent from the monit-general mailing list archive at Nabble.com.
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> To unsubscribe:
>>>>> http://lists.nongnu.org/mailman/listinfo/monit-general
>>> 
>>> -- 
>>> View this message in context:
>>> http://old.nabble.com/-monit--can%27t-monitor-one-of-my-filesystems-tp28437378p28447734.html
>>> Sent from the monit-general mailing list archive at Nabble.com.
> 
> -- 
> View this message in context: 
> http://old.nabble.com/-monit--can%27t-monitor-one-of-my-filesystems-tp28437378p28478533.html
> Sent from the monit-general mailing list archive at Nabble.com.



