Re: [PATCH 2/2] coroutine: Revert to constant batch size

From: 成川 弘樹 (Hiroki Narukawa)
Subject: Re: [PATCH 2/2] coroutine: Revert to constant batch size
Date: Fri, 13 May 2022 10:18:24 +0900
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.9.0

I'm not sure how much testing is expected for "Tested-by".

If testing from my side is enough, then yes. However, I did not verify
that this patch fixes the problem of excessive resource consumption.

On 2022/05/12 21:50, Philippe Mathieu-Daudé wrote:
Hi Hiroki,

On Thu, May 12, 2022 at 8:57 AM 成川 弘樹 <hnarukaw@yahoo-corp.jp> wrote:

Thank you for your fix.

I confirmed that after applying this patch, the performance improvement
I intended with commit 4c41c69e is still preserved in our environment.

Is that equivalent to a formal
Tested-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp> ?

On 2022/05/11 0:10, Kevin Wolf wrote:
Commit 4c41c69e changed the way the coroutine pool is sized because for
virtio-blk devices with a large queue size and heavy I/O, it was just
too small and caused coroutines to be deleted and reallocated soon
afterwards. The change made the size dynamic based on the number of
queues and the queue size of virtio-blk devices.

There are two important numbers here. Slightly simplified: when a
coroutine terminates, it is generally stored in the global release pool,
up to a certain maximum pool size; if the pool is already full, the
coroutine is freed instead. Conversely, when a new coroutine is
allocated, coroutines from the release pool are reused only if the pool
has already reached a certain minimum size (the batch size); otherwise,
we allocate new coroutines.
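
To make the two thresholds concrete, here is a minimal standalone
sketch of that logic. It is illustrative only: the identifiers
coroutine_recycle() and coroutine_alloc() are invented for this sketch,
and QEMU's actual code additionally keeps per-thread alloc pools.

    #include <stdlib.h>

    typedef struct Coroutine Coroutine;
    struct Coroutine {
        Coroutine *next;              /* intrusive free-list link */
        /* ... stack pointer, entry point, ... */
    };

    static Coroutine *release_pool;       /* terminated, kept for reuse */
    static unsigned release_pool_size;
    static unsigned pool_max_size = 64;   /* beyond this, free on terminate */
    static unsigned pool_batch_size = 64; /* reuse only above this watermark */

    /* Called when a coroutine terminates. */
    static void coroutine_recycle(Coroutine *co)
    {
        if (release_pool_size < pool_max_size) {
            co->next = release_pool;  /* keep it in the pool for reuse */
            release_pool = co;
            release_pool_size++;
        } else {
            free(co);                 /* pool is full: really free it */
        }
    }

    /* Called when a new coroutine is needed. */
    static Coroutine *coroutine_alloc(void)
    {
        if (release_pool_size >= pool_batch_size && release_pool) {
            Coroutine *co = release_pool;  /* reuse a pooled coroutine */
            release_pool = co->next;
            release_pool_size--;
            return co;
        }
        return calloc(1, sizeof(Coroutine)); /* below the batch size */
    }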

The problem after commit 4c41c69e is that it increases not only the
maximum pool size (which is the intended effect), but also the batch
size for reusing coroutines (which is a bug). It means that in cases
with many devices and/or a large queue size (which defaults to the
number of vcpus for virtio-blk-pci), many thousands of coroutines could
be sitting in the release pool without ever being reused.

This is not only a waste of memory and allocations; it also makes the
QEMU process likely to hit the vm.max_map_count limit on Linux, because
each coroutine requires two mappings (its stack and the guard page for
the stack). Once the limit is hit, mprotect() starts to fail with
ENOMEM, causing QEMU to abort() in qemu_alloc_stack().
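
For reference, here is a simplified sketch of the kind of allocation
described above. alloc_guarded_stack() is a hypothetical stand-in for
qemu_alloc_stack(), with details and error handling trimmed; the point
is that the mprotect() call splits the region into two kernel mappings
(guard page plus usable stack), so each live coroutine charges two
entries against vm.max_map_count.

    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static void *alloc_guarded_stack(size_t stack_size)
    {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);

        /* One mapping: guard page + usable stack, all writable for now. */
        void *base = mmap(NULL, stack_size + page, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED) {
            abort();
        }

        /* Make the lowest page inaccessible. Splitting off the guard
         * page creates a second mapping; once vm.max_map_count is
         * reached, this fails with ENOMEM and QEMU ends up aborting. */
        if (mprotect(base, page, PROT_NONE) != 0) {
            abort();
        }

        return (char *)base + page;   /* usable stack above the guard */
    }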

In order to fix the problem, change the batch size back to 64 to avoid
uselessly accumulating coroutines in the release pool, but keep the
dynamic maximum pool size so that coroutines aren't freed too early
in heavy I/O scenarios.
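
Sketched as code (illustrative, not the literal diff): before the fix,
one dynamic value served as both the maximum pool size and the reuse
watermark, so the reuse check was almost never true on big setups;
afterwards, the watermark is a small constant again while the maximum
pool size stays dynamic.

    /* Buggy (post-4c41c69e) reuse check, with pool_max_size grown per
     * virtio-blk device (roughly queues * queue_size):
     *
     *     if (release_pool_size >= pool_max_size) { ... reuse ... }
     */

    enum { POOL_BATCH_SIZE = 64 };    /* constant again after the fix */

    /* Still grown per device; now it only bounds how many terminated
     * coroutines the release pool keeps around. */
    static unsigned pool_max_size = POOL_BATCH_SIZE;

    static int can_reuse(unsigned release_pool_size)
    {
        return release_pool_size >= POOL_BATCH_SIZE;
    }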

Note that this fix doesn't strictly make it impossible to hit the limit,
but this would only happen if most of the coroutines are actually in use
at the same time, not just sitting in a pool. This is the same behaviour
as we already had before commit 4c41c69e. Fully preventing this would
require allowing qemu_coroutine_create() to return an error, but it
doesn't seem to be a scenario that people hit in practice.

Cc: qemu-stable@nongnu.org
Fixes: 4c41c69e05fe28c0f95f8abd2ebf407e95a4f04b
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
