qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [Qemu-devel] [PATCH] job: Fix nested aio_poll() hanging


From: Eric Blake
Subject: Re: [Qemu-block] [Qemu-devel] [PATCH] job: Fix nested aio_poll() hanging in job_txn_apply
Date: Wed, 22 Aug 2018 08:42:11 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 08/22/2018 12:48 AM, Fam Zheng wrote:
However, I was unable to quickly audit whether all callers really did have
the lock (it balloons into whether all callers of job_finalize() have the
lock), so I'm reluctant to give R-b.

@@ -857,10 +849,10 @@ static void job_completed_txn_success(Job *job)
           assert(other_job->ret == 0);
       }
-    job_txn_apply(txn, job_transition_to_pending, false);
+    job_txn_apply(txn, job_transition_to_pending);
       /* If no jobs need manual finalization, automatically do so */
-    if (job_txn_apply(txn, job_needs_finalize, false) == 0) {
+    if (job_txn_apply(txn, job_needs_finalize) == 0) {
           job_do_finalize(job);

That said, the change makes sense here: since the first direct call to
job_txn_apply() did not need to lock, why should the second indirect call
through job_do_finalize() need it?

Or, is the fix to have job_do_finalize() gain a bool parameter, where
job_completed_txn_success() would pass false to that parameter, while
job_finalize() would pass true (again, going back to auditing whether all
callers of job_finalize() have already acquired the context).

There are two callers of job_finalize():

Okay, so the audit would have been easier if I had actually tried grepping for it. Still, it can't hurt to make the commit message give more details on what you checked, to make it easier for the reviewers.


     address@hidden:~/work/qemu [master]$ git grep -w -A1 '^\s*job_finalize'
     blockdev.c:    job_finalize(&job->job, errp);
     blockdev.c-    aio_context_release(aio_context);
     --
     job-qmp.c:    job_finalize(job, errp);
     job-qmp.c-    aio_context_release(aio_context);

Yep, that's pretty conclusive that the context is already held by all callers.

Reviewed-by: Eric Blake <address@hidden>

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



reply via email to

[Prev in Thread] Current Thread [Next in Thread]