bug-apl

Re: Absolute limits of rank 2 bool matrix size in GNU APL?


From: Kacper Gutowski
Subject: Re: Absolute limits of rank 2 bool matrix size in GNU APL?
Date: Wed, 29 Dec 2021 00:03:33 +0100

This is somewhat tangential but,

On Tue, Dec 28, 2021 at 01:25:19PM -0600, Blake McBride wrote:
> Level 1: you are using the RAM that exists (not over-committed).
>
> Level 2: you are using more RAM than you have, causing over-commit and paging.
>
> Level 3: you allocate more memory than exists in RAM + paging.

In the context of memory handling in Linux, your "level 2" is not yet considered overcommitment. When swap is configured, this is a perfectly normal mode of operation and, depending on the workload, it might not be problematic at all. The commit limit is calculated as the sum of all swap space plus a configurable percentage of physical RAM.
The problem is that, by default, the commit limit is mostly ignored.


On Tue, Dec 28, 2021 at 1:16 PM Elias Mårtenson wrote:
> Unfortunately, Linux malloc never returns NULL, even if you try to allocate a petabyte in one allocation. You can write to this memory, and new pages will be created as you write; at some point a write will fail with a SEGV because there are no free pages left.
>
> You can try it right now: write a C program that allocates a few TB of RAM and see what happens. Actually, nothing will happen until you start writing to this memory.

Minor nit, but it's not never. With vm.overcommit_memory set to 0 (the default heuristic mode), malloc will still return NULL for obviously oversized requests (more than the commit limit in one go; allocating "a few TB" on my machine with 8G fails early).

But, of course, what is important is that in this default configuration it doesn't actually reserve the requested amount of memory and later a page fault might end up being fatal when one tries to access it.

Linux has a MAP_POPULATE flag that pre-faults mmapped memory, and my understanding is that if it succeeds, the memory should be safe to use later. But that's no different from touching all the pages returned from malloc: it just gets you killed early, before mmap returns.

> What's even worse is that once you start to fail, the kernel will randomly kill processes until it gets more free pages. If that sounds idiotic, that's because it is, but that's how it works.

It's not random. The scoring system looks silly, but it works really well (at least without swap), usually killing exactly what needs to be killed. (In essence, doing exactly what Blake says he would do manually.) In my experience, if you can't set vm.overcommit_memory to 2 (and you can't, because of browsers etc.), then the OOM killer is close to the best that can be done to manage it.

You can configure the kernel not to do this, but unfortunately you're going to have other problems if you do, primarily because all Linux software is written with the default behaviour in mind.
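For completeness, the configuration being discussed is the vm.overcommit_memory sysctl; a sketch of strict accounting (mode 2), with the caveat above that much software will misbehave under it:

```shell
# Strict accounting: commit limit = swap + overcommit_ratio% of RAM,
# and allocations fail with ENOMEM instead of overcommitting.
# Expect breakage in software that assumes the default behaviour.
sysctl vm.overcommit_memory=2
sysctl vm.overcommit_ratio=50     # percent of RAM counted toward the limit
grep -i commit /proc/meminfo      # inspect CommitLimit / Committed_AS
```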

Sad truth.

-k


