From: Emanuele Giuseppe Esposito
Subject: [RFC PATCH 4/5] child_job_drained_poll: override polling condition only when in home thread
Date: Tue, 1 Mar 2022 09:21:12 -0500
drv->drained_poll() is only implemented by mirror, and allows
it to drain from within its coroutine. The mirror implementation uses
the in_drain flag to recognize when it is draining from the coroutine,
and thereby avoids deadlocking (otherwise the poll condition in
child_job_drained_poll() would end up waiting for itself).
The problem is that this flag is dangerous, because it breaks
the bdrv_drained_begin() invariant: once drained_begin returns, all
jobs, in-flight requests, and anything else running in the iothread
are blocked.
The invariant can be broken in the following way:
iothread(mirror): s->in_drain = true; // mirror.c:1112
main loop: bdrv_drained_begin(mirror_bs);
/*
* drained_begin waits for the bdrv_drain_poll_top_level() condition,
* which translates to child_job_drained_poll() for jobs, but
* mirror implements drv->drained_poll(), so it returns
* !!in_flight_requests, which is 0 (assertion in mirror.c:1105).
*/
main loop: thinks iothread is stopped and is modifying the graph...
iothread(mirror): *continues*, as nothing is stopping it
iothread(mirror): bdrv_drained_begin(bs);
/* draining reads the graph while it is modified!! */
main loop: done modifying the graph...
To fix this, we can simply allow drv->drained_poll()
to be called only by the iothread, and not by the main loop.
We distinguish the two by using in_aio_context_home_thread(), which
returns false if the calling thread is not the home thread of @ctx.
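For illustration only, the effect of the change can be sketched as a
standalone toy model. Everything below is hypothetical: ToyJob, the
toy_* functions and the integer thread ids are stand-ins invented for
this sketch and are not QEMU types or APIs. The mirror override is
reduced to the in_drain shortcut (the real one also takes the paused
and cancelled state into account), and the home-thread check is
modelled as a plain thread id comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BlockJob plus its driver; not QEMU's real types. */
typedef struct ToyJob ToyJob;
struct ToyJob {
    bool busy;        /* the job coroutine is between pause points */
    bool completed;   /* the job has finished */
    bool in_drain;    /* models mirror's s->in_drain */
    int in_flight;    /* models mirror's in-flight request count */
    int home_thread;  /* id of the thread that runs the job's AioContext */
    bool (*drained_poll)(const ToyJob *job); /* optional driver override */
};

/* Simplified model of mirror's override: while in_drain is set, report only
 * the in-flight requests so the coroutine does not wait for itself. */
static bool toy_mirror_drained_poll(const ToyJob *job)
{
    if (!job->in_drain) {
        return true;
    }
    return job->in_flight != 0;
}

/* Polling condition before the patch: the override is always honoured,
 * regardless of which thread is asking. */
static bool toy_poll_old(const ToyJob *job)
{
    assert(job != NULL);
    if (!job->busy || job->completed) {
        return false;
    }
    return job->drained_poll ? job->drained_poll(job) : true;
}

/* Polling condition after the patch: only the home thread may use the
 * override; callers on other threads keep polling while the job is busy. */
static bool toy_poll_patched(const ToyJob *job, int calling_thread)
{
    assert(job != NULL);
    if (!job->busy || job->completed) {
        return false;
    }
    if (calling_thread == job->home_thread && job->drained_poll) {
        return job->drained_poll(job);
    }
    return true;
}
```

With a busy job whose in_drain is set and in_flight is 0, toy_poll_old()
reports "nothing to wait for" to any caller, which is the window
described above. toy_poll_patched() gives that answer only when
calling_thread equals home_thread; a caller on another thread (the main
loop in the scenario) keeps waiting until the job is no longer busy.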
Co-Developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
blockjob.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/blockjob.c b/blockjob.c
index 10815a89fe..e132d9587e 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -107,6 +107,7 @@ static bool child_job_drained_poll(BdrvChild *c)
BlockJob *bjob = c->opaque;
Job *job = &bjob->job;
const BlockJobDriver *drv = block_job_driver(bjob);
+ AioContext *ctx = block_job_get_aio_context(bjob);
/* An inactive or completed job doesn't have any pending requests. Jobs
* with !job->busy are either already paused or have a pause point after
@@ -117,7 +118,7 @@ static bool child_job_drained_poll(BdrvChild *c)
/* Otherwise, assume that it isn't fully stopped yet, but allow the job to
* override this assumption. */
- if (drv->drained_poll) {
+ if (in_aio_context_home_thread(ctx) && drv->drained_poll) {
return drv->drained_poll(bjob);
} else {
return true;
--
2.31.1