Re: More intelligent swap monitoring?
From: Lutz Mader
Subject: Re: More intelligent swap monitoring?
Date: Fri, 2 May 2025 17:44:38 +0200
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:60.0) Gecko/20100101 Thunderbird/60.4.0

Hello Jamie,
this is a simple tgz package. Feel free to unpack it and copy the
bin/monit file to a suitable place. But keep in mind that this is only
a test package, based on Monit 5.35.0 for Linux x86_64.
> That's brilliant - thank you very much.
The official fix will probably become available with 5.36.0.
The data should be similar to the output of "vmstat -s" and "vmstat 30"
(see the "set daemon 30" option in your monitrc file).
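
For comparison, here is a rough sketch of how a per-cycle figure can be
derived from a cumulative kernel counter. It is not the code from the
test package. It assumes the counter names pgpgin, pgpgout, pswpin and
pswpout as they appear in /proc/vmstat on Linux; the function names
vmstat_counter and counter_delta are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Look up "name value" at the start of a line in /proc/vmstat-style text.
 * Returns 1 and stores the value on success, 0 if the counter is missing.
 * (Hypothetical helper, not part of monit.) */
int vmstat_counter(const char *text, const char *name, unsigned long long *value) {
        size_t len = strlen(name);
        const char *line = text;
        while (line && *line) {
                /* Require a space after the name so "pgpg" does not match "pgpgin" */
                if (strncmp(line, name, len) == 0 && line[len] == ' ')
                        return sscanf(line + len, "%llu", value) == 1;
                line = strchr(line, '\n');
                if (line)
                        line++;
        }
        return 0;
}

/* Change of a cumulative counter between two samples.
 * Returns 0 if the counter went backwards (e.g. after a reboot). */
unsigned long long counter_delta(unsigned long long previous, unsigned long long current) {
        return current >= previous ? current - previous : 0;
}
```

Sampling once per monit cycle and passing the previous and current
values to counter_delta gives the change over one cycle; vmstat reports
such changes as per-second rates.
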
Keep in mind that this is for testing and validation purposes only.
Lutz
On 02.05.25 at 15:14, Jamie Burchell via This is the general
mailing list for monit wrote:
> Hi Lutz
>
> That's brilliant - thank you very much.
>
> I'm currently using the version 5.33.0 from EPEL (Rocky Linux 9). How should
> I replace/install the test package?
>
> Thanks in advance
> Jamie
>
> -----Original Message-----
> From: monit-general-bounces+jamie=ib3.uk@nongnu.org
> <monit-general-bounces+jamie=ib3.uk@nongnu.org> On Behalf Of Lutz Mader
> Sent: 02 May 2025 02:11
> To: This is the general mailing list for monit <monit-general@nongnu.org>
> Subject: Re: More intelligent swap monitoring?
>
> Sorry Jamie, I'm late.
>
> Based on your suggestion, I added a new test to "check system".
>
> check system $HOST
> # if memory usage > 75% then alert
> # if swap usage > 25% then alert
> if pagein > 10 pages then alert
> if pageout > 20 pages then alert
> if pagefault > 50 pages then alert
>
> A test package is available from
> https://bitbucket.org/lutzmad/monit/downloads/monit-vmstat-suse12-x64.tar.gz
>
> Let me know if this fixes your problem,
> Lutz
>
> Addendum:
> ~/bin/monit status slesbuild
> Monit 5.35.0 uptime: 20m
>
> System 'slesbuild'
> status OK
> monitoring status Monitored
> monitoring mode active
> on reboot start
> load average [0.00] [0.00] [0.05]
> cpu 0.4%usr 3.3%sys 0.0%nice 2.1%iowait
> 0.0%hardirq 0.0%softirq 0.0%steal 0.0%guest 0.0%guestnice
> memory usage 492.2 MB [27.4%]
> swap usage 8.0 MB [0.4%]
> pagein count 0 [58]
> pageout count 0 [2050]
> uptime 5h 0m
> boot time Thu, 01 May 2025 21:02:12
> filedescriptors 3264 [1.8% of 180992 limit]
> data collected Fri, 02 May 2025 02:03:04
>
>
> On 17.01.25 at 11:51, Jamie Burchell via This is the general
> mailing list for monit wrote:
>> Hello
>>
>> I have reduced the amount of memory one of the services was consuming,
>> which has abated the problem for now. However, there's still some swap
>> being used, so perhaps in a few days' time the problem will come up again.
>>
>> Here's the output of /proc/meminfo as requested
>>
>> MemTotal: 7868472 kB
>> MemFree: 671568 kB
>> MemAvailable: 3261880 kB
>> Buffers: 0 kB
>> Cached: 893768 kB
>> SwapCached: 47448 kB
>> Active: 2661256 kB
>> Inactive: 1320396 kB
>> Active(anon): 2400964 kB
>> Inactive(anon): 800724 kB
>> Active(file): 260292 kB
>> Inactive(file): 519672 kB
>> Unevictable: 3072 kB
>> Mlocked: 0 kB
>> SwapTotal: 4194300 kB
>> SwapFree: 3923228 kB
>> Zswap: 0 kB
>> Zswapped: 0 kB
>> Dirty: 32 kB
>> Writeback: 0 kB
>> AnonPages: 3003564 kB
>> Mapped: 169140 kB
>> Shmem: 113804 kB
>> KReclaimable: 2124156 kB
>> Slab: 2540272 kB
>> SReclaimable: 2124156 kB
>> SUnreclaim: 416116 kB
>> KernelStack: 13712 kB
>> PageTables: 77444 kB
>> SecPageTables: 0 kB
>> NFS_Unstable: 0 kB
>> Bounce: 0 kB
>> WritebackTmp: 0 kB
>> CommitLimit: 8128536 kB
>> Committed_AS: 12257916 kB
>> VmallocTotal: 34359738367 kB
>> VmallocUsed: 31016 kB
>> VmallocChunk: 0 kB
>> Percpu: 1840 kB
>> HardwareCorrupted: 0 kB
>> AnonHugePages: 1378304 kB
>> ShmemHugePages: 0 kB
>> ShmemPmdMapped: 0 kB
>> FileHugePages: 0 kB
>> FilePmdMapped: 0 kB
>> CmaTotal: 0 kB
>> CmaFree: 0 kB
>> Unaccepted: 0 kB
>> HugePages_Total: 0
>> HugePages_Free: 0
>> HugePages_Rsvd: 0
>> HugePages_Surp: 0
>> Hugepagesize: 2048 kB
>> Hugetlb: 0 kB
>> DirectMap4k: 118624 kB
>> DirectMap2M: 8269824 kB
>>
>> Regards
>> Jamie
>>
>>
>> --
>>
>>
>> -----Original Message-----
>> From: monit-general-bounces+jamie=ib3.co.uk@nongnu.org
>> <monit-general-bounces+jamie=ib3.co.uk@nongnu.org> On Behalf Of Lutz Mader
>> Sent: 15 January 2025 20:37
>> To: This is the general mailing list for monit <monit-general@nongnu.org>
>> Subject: Re: More intelligent swap monitoring?
>>
>> Hello,
>> I have no useful examples; on my system vmstat only shows si=0 and so=0.
>> On a Linux system, the values calculated from /proc/meminfo match.
>>
>> And the system status information seems to be useful.
>>
>>> Is it possible to configure Monit to alert of actual swapping
>>> out rather than swap file usage, or am I barking up the wrong tree?
>>
>> You are right, monit does not show the actual swap file I/O (page in/out
>> data); its swap data is based on usage only.
>>
>> monit status LINUX
>> Monit 5.34.0 uptime: 49d 0h 43m
>>
>> System 'LINUX'
>> status OK
>> monitoring status Monitored
>> monitoring mode active
>> on reboot start
>> load average [7.30] [8.60] [12.77]
>> cpu 0.8%usr 0.3%sys 14.7%nice 0.0%iowait
>> 0.0%hardirq 0.0%softirq 0.0%steal 0.0%guest 0.0%guestnice
>> memory usage 42.6 GB [11.3%]
>> swap usage 10.4 MB [0.5%]
>> uptime 61d 19h 43m
>> boot time Thu, 14 Nov 2024 14:54:46
>> filedescriptors 16800 [0.2% of 6815744 limit]
>> data collected Wed, 15 Jan 2025 10:37:52
>>
>> The vmstat data.
>>
>> Swap
>> si: Amount of memory swapped in from disk (/s).
>> so: Amount of memory swapped to disk (/s).
>>
>> vmstat
>> procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
>>  r  b   swpd      free   buff    cache   si   so    bi    bo   in   cs us sy id wa st
>>  7  0  10624 273713168 811688 87684144    0    0    69    40    0    0 11  1 88  0  0
>>
>> vmstat -s
>> 395130516 K total memory
>> 120558732 K used memory
>> 90364732 K active memory
>> 11131200 K inactive memory
>> 274571784 K free memory
>> 811824 K buffer memory
>> 88984004 K swap cache
>> 2095100 K total swap
>> 10624 K used swap
>> 2084476 K free swap
>> 1313872869 non-nice user cpu ticks
>> 1427185498 nice user cpu ticks
>> 247910673 system cpu ticks
>> 22633904247 idle cpu ticks
>> 6802373 IO-wait cpu ticks
>> 0 IRQ cpu ticks
>> 2608560 softirq cpu ticks
>> 0 stolen cpu ticks
>> 17687866581 pages paged in
>> 10300491953 pages paged out
>> 1345 pages swapped in
>> 6655 pages swapped out
>> 3444460676 interrupts
>> 416044891 CPU context switches
>> 1731592487 boot time
>> 273618695 forks
>>
>> The values based on /proc/meminfo.
>>
>> cat /proc/meminfo
>> MemTotal: 395130516 kB
>> MemFree: 275604548 kB
>> MemAvailable: 353323016 kB
>> Buffers: 811800 kB
>> Cached: 85849340 kB
>> SwapCached: 892 kB
>> Active: 89256636 kB
>> Inactive: 11124384 kB
>> Active(anon): 18733260 kB
>> Inactive(anon): 3398376 kB
>> Active(file): 70523376 kB
>> Inactive(file): 7726008 kB
>> Unevictable: 975348 kB
>> Mlocked: 975348 kB
>> SwapTotal: 2095100 kB
>> SwapFree: 2084476 kB
>> Dirty: 468 kB
>> Writeback: 0 kB
>> AnonPages: 14695184 kB
>> Mapped: 2965708 kB
>> Shmem: 8423756 kB
>> Slab: 8029800 kB
>> SReclaimable: 2699580 kB
>> SUnreclaim: 5330220 kB
>> KernelStack: 70320 kB
>> PageTables: 421232 kB
>> NFS_Unstable: 0 kB
>> Bounce: 0 kB
>> WritebackTmp: 0 kB
>> CommitLimit: 199660356 kB
>> Committed_AS: 26151808 kB
>> VmallocTotal: 34359738367 kB
>> VmallocUsed: 0 kB
>> VmallocChunk: 0 kB
>> HardwareCorrupted: 0 kB
>> AnonHugePages: 0 kB
>> ShmemHugePages: 0 kB
>> ShmemPmdMapped: 0 kB
>> HugePages_Total: 0
>> HugePages_Free: 0
>> HugePages_Rsvd: 0
>> HugePages_Surp: 0
>> Hugepagesize: 2048 kB
>> DirectMap4k: 37410524 kB
>> DirectMap2M: 286273536 kB
>> DirectMap1G: 80740352 kB
>>
>> The values are used to calculate the monit swap data.
>>
>> in src/process/sysdep_LINUX.c
>>
>> used_system_memory_sysdep(SystemInfo_T *si)
>>
>>         // Swap
>>         if (! (ptr = strstr(buf, "SwapTotal:")) || sscanf(ptr + 10, "%llu", &swap_total) != 1) {
>>                 Log_error("system statistic error -- cannot get swap total amount\n");
>>                 goto error;
>>         }
>>         if (! (ptr = strstr(buf, "SwapFree:")) || sscanf(ptr + 9, "%llu", &swap_free) != 1) {
>>                 Log_error("system statistic error -- cannot get swap free amount\n");
>>                 goto error;
>>         }
>>         si->swap.size = swap_total * 1024;
>>         si->swap.usage.bytes = (swap_total - swap_free) * 1024;
>>
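
As a cross-check of the calculation quoted above, here is a small
stand-alone sketch that applies the same SwapTotal minus SwapFree
arithmetic. It is not monit's code; the helper names meminfo_kb and
swap_used_percent are invented for this example. With the values from
the meminfo listing above (2095100 kB total, 2084476 kB free), 10624 kB
are in use, about 0.5%, which matches the monit status output.

```c
#include <stdio.h>
#include <string.h>

/* Extract the number following key (e.g. "SwapTotal:") from
 * /proc/meminfo-style text. Returns 1 on success, 0 otherwise.
 * (Hypothetical helper, not part of monit.) */
int meminfo_kb(const char *text, const char *key, unsigned long long *kb) {
        const char *ptr = strstr(text, key);
        return ptr != NULL && sscanf(ptr + strlen(key), "%llu", kb) == 1;
}

/* Swap in use as a percentage of total swap.
 * Returns 0 when no swap is configured or the input is inconsistent. */
double swap_used_percent(unsigned long long total_kb, unsigned long long free_kb) {
        if (total_kb == 0 || free_kb > total_kb)
                return 0.0;
        return 100.0 * (double)(total_kb - free_kb) / (double)total_kb;
}
```

The result only describes how much swap is occupied. It says nothing
about whether pages are currently being swapped in or out.
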
>> The question is: what does the /proc/meminfo output look like on your
>> system?
>>
>> Are some examples available, based on vmstat and the /proc/meminfo data?
>>
>> Lutz
>>
>>
>> On 14.01.25 at 12:10, Jamie Burchell via This is the general
>> mailing list for monit wrote:
>>> Hi
>>>
>>>
>>>
>>> I currently use Monit to alert me if swap usage is over 20%. This works
>>> most of the time, but I currently have a particularly stubborn VM which
>>> appears to like to add data to swap and then not touch it. vmstat shows
>>> either no swap-ins or the odd non-zero one, and no swap-outs. Is it
>>> possible to configure Monit to alert of actual swapping out rather than
>>> swap file usage, or am I barking up the wrong tree?
>>>
>>>
>>>
>>> Thanks in advance!
>>>
>>> Jamie
>>>
>>
>>
>
>