[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[PATCH v4 1/3] job: take each job's lock individually in job_txn_apply
From: |
Stefan Reiter |
Subject: |
[PATCH v4 1/3] job: take each job's lock individually in job_txn_apply |
Date: |
Wed, 1 Apr 2020 10:15:02 +0200 |
All callers of job_txn_apply hold a single job's lock, but different
jobs within a transaction can have different contexts, thus we need to
lock each one individually before applying the callback function.
Similar to job_completed_txn_abort this also requires releasing the
caller's context before and reacquiring it after to avoid recursive
locks which might break AIO_WAIT_WHILE in the callback.
This also brings to light a different issue: When a callback function in
job_txn_apply moves it's job to a different AIO context, job_exit will
try to release the wrong lock (now that we re-acquire the lock
correctly, previously it would just continue with the old lock, leaving
the job unlocked for the rest of the codepath back to job_exit). Fix
this by not caching the job's context in job_exit and add a comment
about why this is done.
One test needed adapting, since it calls job_finalize directly, so it
manually needs to acquire the correct context.
Signed-off-by: Stefan Reiter <address@hidden>
---
job.c | 48 ++++++++++++++++++++++++++++++++++---------
tests/test-blockjob.c | 2 ++
2 files changed, 40 insertions(+), 10 deletions(-)
diff --git a/job.c b/job.c
index 134a07b92e..5fbaaabf78 100644
--- a/job.c
+++ b/job.c
@@ -136,17 +136,36 @@ static void job_txn_del_job(Job *job)
}
}
-static int job_txn_apply(JobTxn *txn, int fn(Job *))
+static int job_txn_apply(Job *job, int fn(Job *))
{
- Job *job, *next;
+ AioContext *inner_ctx;
+ Job *other_job, *next;
+ JobTxn *txn = job->txn;
int rc = 0;
- QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
- rc = fn(job);
+ /*
+ * Similar to job_completed_txn_abort, we take each job's lock before
+ * applying fn, but since we assume that outer_ctx is held by the caller,
+ * we need to release it here to avoid holding the lock twice - which would
+ * break AIO_WAIT_WHILE from within fn.
+ */
+ aio_context_release(job->aio_context);
+
+ QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
+ inner_ctx = other_job->aio_context;
+ aio_context_acquire(inner_ctx);
+ rc = fn(other_job);
+ aio_context_release(inner_ctx);
if (rc) {
break;
}
}
+
+ /*
+ * Note that job->aio_context might have been changed by calling fn, so we
+ * can't use a local variable to cache it.
+ */
+ aio_context_acquire(job->aio_context);
return rc;
}
@@ -774,11 +793,11 @@ static void job_do_finalize(Job *job)
assert(job && job->txn);
/* prepare the transaction to complete */
- rc = job_txn_apply(job->txn, job_prepare);
+ rc = job_txn_apply(job, job_prepare);
if (rc) {
job_completed_txn_abort(job);
} else {
- job_txn_apply(job->txn, job_finalize_single);
+ job_txn_apply(job, job_finalize_single);
}
}
@@ -824,10 +843,10 @@ static void job_completed_txn_success(Job *job)
assert(other_job->ret == 0);
}
- job_txn_apply(txn, job_transition_to_pending);
+ job_txn_apply(job, job_transition_to_pending);
/* If no jobs need manual finalization, automatically do so */
- if (job_txn_apply(txn, job_needs_finalize) == 0) {
+ if (job_txn_apply(job, job_needs_finalize) == 0) {
job_do_finalize(job);
}
}
@@ -849,9 +868,10 @@ static void job_completed(Job *job)
static void job_exit(void *opaque)
{
Job *job = (Job *)opaque;
- AioContext *ctx = job->aio_context;
+ AioContext *ctx;
- aio_context_acquire(ctx);
+ job_ref(job);
+ aio_context_acquire(job->aio_context);
/* This is a lie, we're not quiescent, but still doing the completion
* callbacks. However, completion callbacks tend to involve operations that
@@ -862,6 +882,14 @@ static void job_exit(void *opaque)
job_completed(job);
+ /*
+ * Note that calling job_completed can move the job to a different
+ * aio_context, so we cannot cache from above. job_txn_apply takes care of
+ * acquiring the new lock, and we ref/unref to avoid job_completed freeing
+ * the job underneath us.
+ */
+ ctx = job->aio_context;
+ job_unref(job);
aio_context_release(ctx);
}
diff --git a/tests/test-blockjob.c b/tests/test-blockjob.c
index 4eeb184caf..7519847912 100644
--- a/tests/test-blockjob.c
+++ b/tests/test-blockjob.c
@@ -367,7 +367,9 @@ static void test_cancel_concluded(void)
aio_poll(qemu_get_aio_context(), true);
assert(job->status == JOB_STATUS_PENDING);
+ aio_context_acquire(job->aio_context);
job_finalize(job, &error_abort);
+ aio_context_release(job->aio_context);
assert(job->status == JOB_STATUS_CONCLUDED);
cancel_common(s);
--
2.26.0