qemu-devel
From: Stefan Reiter
Subject: Re: [PATCH v4 2/3] replication: acquire aio context before calling job_cancel_sync
Date: Thu, 2 Apr 2020 17:05:04 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.6.0

On 02/04/2020 14:41, Max Reitz wrote:
> On 01.04.20 10:15, Stefan Reiter wrote:
>> job_cancel_sync requires the job's lock to be held, all other callers
>> already do this (replication_stop, drive_backup_abort,
>> blockdev_backup_abort, job_cancel_sync_all, cancel_common).
>
> I think all other callers come directly from QMP, though, so they have
> no locks yet.  This OTOH is called from a block driver function, so I
> would assume the BDS context is locked already (or rather, this is
> executed in the BDS context).
>
> I also think that the commit job runs in the same context.  So I would
> assume that this would be a nested lock, which should be unnecessary and
> might cause problems.  Maybe we should just assert that the job’s
> context is the current context?
>
> (Or would that still be problematic because now job_txn_apply() wants to
> release some context, and that isn’t possible without this patch?  I
> would hope it’s possible, because I think we shouldn’t have to acquire
> the current context, and should be free to release it...?  I have no
> idea.  Maybe we are actually free to reacquire the current context here.)


You're right, this seems to be unnecessary. Adding an

  assert(commit_job->aio_context == qemu_get_current_aio_context());

to make sure still seems like a good idea, though. bdrv_close appears to always be called with the lock on its driver's AioContext held.

>> Signed-off-by: Stefan Reiter <address@hidden>
>> ---
>>  block/replication.c | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/replication.c b/block/replication.c
>> index 413d95407d..17ddc31569 100644
>> --- a/block/replication.c
>> +++ b/block/replication.c
>> @@ -144,12 +144,18 @@ fail:
>>  static void replication_close(BlockDriverState *bs)
>>  {
>>      BDRVReplicationState *s = bs->opaque;
>> +    Job *commit_job;
>> +    AioContext *commit_ctx;
>>
>>      if (s->stage == BLOCK_REPLICATION_RUNNING) {
>>          replication_stop(s->rs, false, NULL);
>>      }
>>      if (s->stage == BLOCK_REPLICATION_FAILOVER) {
>> -        job_cancel_sync(&s->commit_job->job);
>> +        commit_job = &s->commit_job->job;
>> +        commit_ctx = commit_job->aio_context;
>> +        aio_context_acquire(commit_ctx);
>> +        job_cancel_sync(commit_job);
>> +        aio_context_release(commit_ctx);

> Anyway, there’s also the problem that I would guess the
> job_cancel_sync() might move the job from its current context back into
> the main context.  Then we’d release the wrong context here.
>
> Max

>>      }
>>
>>      if (s->mode == REPLICATION_MODE_SECONDARY) {
