qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 2/3] block: wait for job callback in block_job_c


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH 2/3] block: wait for job callback in block_job_cancel_sync
Date: Thu, 19 Apr 2012 11:31:10 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Wed, Apr 18, 2012 at 03:12:02PM +0200, Paolo Bonzini wrote:
> The limitation on not having I/O after cancellation cannot really be
> held.  Even streaming has a very small race window where you could
> cancel a job and have it report completion.  If this window is hit,
> bdrv_change_backing_file() will yield and possibly cause accesses to
> dangling pointers etc.
> 
> So, let's just assume that we cannot know exactly what will happen
> after the coroutine has set busy to false.  We can set a very lax
> condition:
> 
> - if we cancel the job, the coroutine won't set it to false again
> (and hence will not call co_sleep_ns again).
> 
> - block_job_cancel_sync will wait for the coroutine to exit, which
> pretty much ensures no race.
> 
> Instead, we put a very strict condition on what to do while busy =
> false.  We track the coroutine that executes the job and reenter it
> (thus cancelling a wait for example) before block_job_cancel restarts.
> Thus you cannot really do anything but co_sleep_ns at that time.
> 
> This patch also drains the I/O *before* canceling the job, so that
> block_job_cancel is quite sure to find the coroutine in quiescent
> state (busy = false).  For mirroring, this means that the job will
> complete itself with a consistent view of the disk.
> 
> Signed-off-by: Paolo Bonzini <address@hidden>
> ---
>  block.c        |   41 +++++++++++++++++++++++++++++++++++++++--
>  block/stream.c |    7 +++----
>  block_int.h    |   17 ++++++++++++-----
>  3 files changed, 54 insertions(+), 11 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 9c7d896..48f4414 100644
> --- a/block.c
> +++ b/block.c
> @@ -4217,7 +4217,15 @@ int block_job_set_speed(BlockJob *job, int64_t value)
> 
>  void block_job_cancel(BlockJob *job)
>  {
> +    /* Complete all guest I/O before cancelling the job, so that if the
> +     * job chooses to complete itself it will do so with a consistent
> +     * view of the disk.
> +     */
> +    bdrv_drain_all();
>      job->cancelled = true;
> +    if (job->co && !job->busy) {
> +        qemu_coroutine_enter(job->co, NULL);
> +    }
>  }

block_job_cancel() is not supposed to block.  Now it will wait however
long it takes to complete in-flight I/O.  It has become synchronous and
therefore the point of having a completion/callback is gone.

If there really is no asynchronous solution it would be cleaner to throw
away all the cancellation/completion logic and just do it synchronously.

Stefan




reply via email to

[Prev in Thread] Current Thread [Next in Thread]