Re: [PATCH 3/3] block-backend: delay application of request queuing

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 3/3] block-backend: delay application of request queuing

From:	Hanna Czenczek
Subject:	Re: [PATCH 3/3] block-backend: delay application of request queuing
Date:	Wed, 12 Apr 2023 16:33:38 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.7.1

On 12.04.23 14:03, Paolo Bonzini wrote:

On Wed, Apr 12, 2023 at 1:54 PM Hanna Czenczek <hreitz@redhat.com> wrote:

On 05.04.23 18:31, Paolo Bonzini wrote:

+    if (busy || blk->in_flight) {
+        return true;
+    }
+
+    if (qatomic_read(&blk->request_queuing) == BLK_QUEUE_READY) {
+        qatomic_set(&blk->request_queuing, BLK_QUEUE_QUIESCENT);
+    }
+    return false;
   }

This implicitly relies on nobody increasing blk->in_flight (or
dev_ops->drained_poll() returning `true` again) while the BB is starting
to be drained, because if the function were to be called again after it
has returned `false` once per drained section (not sure if that’s
possible![1]), then we’d end up in the original situation, with
in_flight elevated and queuing enabled.

Yes, it does.

Is that really strictly guaranteed somehow or is it rather a complex
conglomerate of many cases that in the end happen to work out
individually?  I mean, I could imagine that running
BlockDevOps.drained_begin() is supposed to guarantee that, but it can’t,
because only NBD seems to implement it.  The commit message talks about
IDE being fine (by accident?) because it needs BQL availability to
submit new requests.  But that’s very complex and I’d rather have a
strict requirement to guarantee correctness.

It's a conglomerate of three cases each of which is sufficient (BQL,
aio_disable_external, bdrv_drained_begin---plus just not using
blk_inc_in_flight could be a fourth, of course). Of these,
aio_disable_external() is going away in favor of the
.bdrv_drained_begin callback; and blk_inc_in_flight() is used rarely
in the first place so I thought it'd be not too hard to have this
requirement.

Does IDE’s BQL requirement work for nested drains, though, i.e. when youhave a drained_begin, followed by another? The commit message doesn’tsay whether it’s impossible for IDE to create a new request in betweenthe two.

I’m a bit afraid that these cases are too complicated for me to fullycomprehend.

[1] If the blk_root_drained_poll() isn’t called anymore after returning
`false`, all will be good, but I assume it will be, because we have a
quiesce_counter, not a quiesce_bool.  We could kind of emulate this by
continuing to return `false` after blk_root_drained_poll() has returned
`false` once, until the quiesce_counter becomes 0.
We could also have blk_root_drained_poll(), if it sees in_flight > 0 &&
request_queuing == BLK_QUEUE_QUIESCENT, revert request_queuing to
BLK_QUEUE_READY and resume all queued requests.

The intended long term fix is to remove request queuing and, if a
request is submitted while BLK_QUEUE_QUIESCENT, give an assertion
failure.


Yep, that would be a nice obvious requirement.

But since the hang requires blk_inc_in_flight() in the device, perhaps
in the short term documenting it in blk_inc_in_flight() may be enough?

Technically it needs a blk_inc_in_flight() whose blk_dec_in_flight()depends on a different request that can be queued (which is only thecase in IDE), so I suppose we could document exactly that in thosefunctions’ interfaces, i.e. that users must take care not to useblk_inc_flight() while the BlockBackend is (being) drained, when theassociated blk_dec_in_flight() may depend on an I/O request to the BB.

I think that should be enough, yes. Well, as long as you can guaranteethat IDE will indeed fulfill that requirement, because I find itdifficult to see/prove...


Hanna

[Prev in Thread]

Current Thread

[Next in Thread]

[PATCH 0/3] block-backend: avoid deadlocks due to early queuing of request, Paolo Bonzini, 2023/04/05
- [PATCH 1/3] aio-posix: disable polling after aio_disable_external(), Paolo Bonzini, 2023/04/05
- [PATCH 2/3] block-backend: make global properties write-once, Paolo Bonzini, 2023/04/05
- [PATCH] block-backend: delay application of request queuing, Paolo Bonzini, 2023/04/05
- [PATCH 3/3] block-backend: delay application of request queuing, Paolo Bonzini, 2023/04/05
  - Re: [PATCH 3/3] block-backend: delay application of request queuing, Hanna Czenczek, 2023/04/12
    - Re: [PATCH 3/3] block-backend: delay application of request queuing, Paolo Bonzini, 2023/04/12
    - Re: [PATCH 3/3] block-backend: delay application of request queuing, Hanna Czenczek <=

Prev by Date: Re: source fails to compile on msys2
Next by Date: Re: [PULL v4 23/83] hw/cxl/cdat: CXL CDAT Data Object Exchange implementation
Previous by thread: Re: [PATCH 3/3] block-backend: delay application of request queuing
Next by thread: [RFC PATCH 0/2] accel/kvm: Extract 'sysemu/kvm_irq.h' from 'sysemu/kvm.h'
Index(es):
- Date
- Thread