qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH 0/3] block-backend: avoid deadlocks due to early queuing of reque


From: Paolo Bonzini
Subject: [PATCH 0/3] block-backend: avoid deadlocks due to early queuing of request
Date: Wed, 5 Apr 2023 18:17:49 +0200

IDE TRIM is a BB user that wants to elevate its BB's in-flight counter
for a "macro" operation that consists of several actual I/O operations.
Each of those operations is individually started and awaited.  It does
this so that blk_drain() will drain the whole TRIM, and not just a
single one of the many discard operations it may encompass.

When request queuing is enabled, this leads to a deadlock: The currently
ongoing discard is drained, and the next one is queued, waiting for the
drain to stop.  Meanwhile, TRIM still keeps the in-flight counter
elevated, waiting for all discards to stop -- which will never happen,
because with the in-flight counter elevated, the BB is never considered
drained, so the drained section does not begin and cannot end.

Draining has two purposes, granting exclusive access to a BlockDriverState
and waiting for all previous requests to complete.  Request queuing was
introduced mostly to ensure exclusive access to the BlockDriverState.
However, the implementation is stricter: it prevents new requests from
being submitted to the BlockDriverState, not allowing them to start
instead of just letting them complete before bdrv_drained_begin() returns.

The reason for this was to ensure progress and avoid a livelock
in blk_drain(), blk_drain_all_begin(), bdrv_drained_begin() or
bdrv_drain_all_begin(), if there is an endless stream of requests to
a BlockBackend.  However, as proved by the IDE TRIM testcase, the current
implementation of request queuing is prone to deadlocks and hard to fix
in different ways---even though Hanna tried, all of her attempts were
unsatisfactory one way or the other.

As the name suggests, deadlocks are worse than livelocks :) so let's
avoid them: turn the request queuing on only after the BlockBackend has
quiesced, and leave the second functionality of bdrv_drained_begin()
to the BQL or to the individual BlockDevOps implementations.

And in fact, this is not really a problem.  Of the various users of
BlockBackend, all of them can avoid the livelock:

- for a device that runs in the vCPU thread, requests will only be
submitted while holding the big QEMU lock, meaning they _won't_ be
submitted during bdrv_drained_begin() or bdrv_drain_all_begin().

- for anything that is blocked by aio_disable_external(), the iothread
will not be woken up.  There is still the case of polling, which has
to be disabled with patch 1.  This is slightly hackish but anyway
aio_disable_external() is going away, meaning that these cases will
fall under the third bucket...

- ... i.e. BlockBackends that can use a .drained_begin callback in
their BlockDevOps to temporarily stop I/O submissions.  Note that this
callback is not _absolutely_ necessary, in particular it is not needed
for safety because the patches do not get away with request queuing.

In the end, request queuing should indeed be unnecessary if .drained_begin
is implemented properly in all BlockDevOps.  It should be possible to warn
if a request come at the wrong time.  However, this is left for later.

Paolo


Based-on: <20230405101634.10537-1-pbonzini@redhat.com>


Paolo Bonzini (3):
  aio-posix: disable polling after aio_disable_external()
  block-backend: make global properties write-once
  block-backend: delay application of request queuing

 block/block-backend.c             | 61 +++++++++++++++++++++----------
 block/commit.c                    |  4 +-
 block/export/export.c             |  2 +-
 block/mirror.c                    |  4 +-
 block/parallels.c                 |  2 +-
 block/qcow.c                      |  2 +-
 block/qcow2.c                     |  2 +-
 block/qed.c                       |  2 +-
 block/stream.c                    |  4 +-
 block/vdi.c                       |  2 +-
 block/vhdx.c                      |  2 +-
 block/vmdk.c                      |  4 +-
 block/vpc.c                       |  2 +-
 include/sysemu/block-backend-io.h |  6 +--
 nbd/server.c                      |  3 +-
 tests/unit/test-bdrv-drain.c      |  4 +-
 tests/unit/test-block-iothread.c  |  2 +-
 util/aio-posix.c                  |  2 +-
 18 files changed, 66 insertions(+), 44 deletions(-)

-- 
2.39.2




reply via email to

[Prev in Thread] Current Thread [Next in Thread]