qemu-devel
From: Stefan Reiter
Subject: Re: [PATCH v4 2/3] replication: acquire aio context before calling job_cancel_sync
Date: Thu, 2 Apr 2020 17:05:04 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.6.0

On 02/04/2020 14:41, Max Reitz wrote:
> On 01.04.20 10:15, Stefan Reiter wrote:
>> job_cancel_sync requires the job's lock to be held, all other callers
>> already do this (replication_stop, drive_backup_abort,
>> blockdev_backup_abort, job_cancel_sync_all, cancel_common).
>
> I think all other callers come directly from QMP, though, so they have
> no locks yet.  This OTOH is called from a block driver function, so I
> would assume the BDS context is locked already (or rather, this is
> executed in the BDS context).
>
> I also think that the commit job runs in the same context.  So I would
> assume that this would be a nested lock, which should be unnecessary and
> might cause problems.  Maybe we should just assert that the job’s
> context is the current context?
>
> (Or would that still be problematic because now job_txn_apply() wants to
> release some context, and that isn’t possible without this patch?  I
> would hope it’s possible, because I think we shouldn’t have to acquire
> the current context, and should be free to release it...?  I have no
> idea.  Maybe we are actually free to reacquire the current context here.)


You're right, this seems to be unnecessary. Adding an

  assert(commit_job->aio_context == qemu_get_current_aio_context());

to make sure still seems like a good idea, though. bdrv_close appears to always be called with the lock on its driver's AioContext held.

>> Signed-off-by: Stefan Reiter <address@hidden>
>> ---
>>  block/replication.c | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/replication.c b/block/replication.c
>> index 413d95407d..17ddc31569 100644
>> --- a/block/replication.c
>> +++ b/block/replication.c
>> @@ -144,12 +144,18 @@ fail:
>>  static void replication_close(BlockDriverState *bs)
>>  {
>>      BDRVReplicationState *s = bs->opaque;
>> +    Job *commit_job;
>> +    AioContext *commit_ctx;
>>
>>      if (s->stage == BLOCK_REPLICATION_RUNNING) {
>>          replication_stop(s->rs, false, NULL);
>>      }
>>      if (s->stage == BLOCK_REPLICATION_FAILOVER) {
>> -        job_cancel_sync(&s->commit_job->job);
>> +        commit_job = &s->commit_job->job;
>> +        commit_ctx = commit_job->aio_context;
>> +        aio_context_acquire(commit_ctx);
>> +        job_cancel_sync(commit_job);
>> +        aio_context_release(commit_ctx);

> Anyway, there’s also the problem that I would guess the
> job_cancel_sync() might move the job from its current context back into
> the main context.  Then we’d release the wrong context here.
>
> Max

>>      }
>>
>>      if (s->mode == REPLICATION_MODE_SECONDARY) {
