qemu-block


From: Eric Blake
Subject: Re: [Qemu-block] [Qemu-devel] [PATCH] mirror: Confirm we're quiesced only if the job is paused or cancelled
Date: Thu, 7 Mar 2019 11:15:48 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.5.1

On 3/7/19 8:03 AM, Sergio Lopez wrote:
> While child_job_drained_begin() calls job_pause(), the job doesn't
> actually transition between states until it runs again and reaches a
> pause point. This means bdrv_drained_begin() may return with some jobs
> using the node still having 'busy == true'.
> 
> As a consequence, block_job_detach_aio_context() may get into a
> deadlock, waiting for the job to be actually paused, while the coroutine
> servicing the job is yielding and doesn't get the opportunity to get
> scheduled again. This situation can be reproduced by issuing a
> 'block-commit' immediately followed by a 'device_del'.
> 
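
For anyone following the thread, the gap between requesting a pause and
actually being paused can be illustrated with a toy model. This is only a
sketch under stated assumptions: struct toy_job, toy_job_pause() and
toy_job_pause_point() are hypothetical stand-ins, not QEMU's Job API, and
the fields merely mirror the names used in the commit message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a job; NOT QEMU's real Job struct. */
struct toy_job {
    int pause_count;   /* outstanding pause requests */
    bool paused;       /* only set once a pause point is reached */
    bool busy;         /* job still considers itself running */
};

/* Requesting a pause only records the request; state is unchanged. */
static void toy_job_pause(struct toy_job *j)
{
    j->pause_count++;
}

/* The job itself must run again and reach this point to become paused. */
static void toy_job_pause_point(struct toy_job *j)
{
    if (j->pause_count > 0) {
        j->paused = true;
        j->busy = false;
    }
}
```

In this model, a caller that waits for paused == true right after
toy_job_pause() waits forever unless the job gets to run up to
toy_job_pause_point(), which is the shape of the deadlock described above.
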
> To ensure bdrv_drained_begin() only returns when the jobs have been
> paused, we change mirror_drained_poll() to only confirm it's quiesced
> when job->paused == true and there aren't any in-flight requests, except
> if we reached that point by a drained section initiated by the
> mirror/commit job itself.
> 
> The other block jobs shouldn't need any changes, as the default
> drained_poll() behavior is to only confirm it's quiesced if the job is
> not busy or completed.
> 
> Signed-off-by: Sergio Lopez <address@hidden>
> ---
>  block/mirror.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 

> @@ -1119,6 +1126,16 @@ static void coroutine_fn mirror_pause(Job *job)
>  static bool mirror_drained_poll(BlockJob *job)
>  {
>      MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
> +
> +    /* If the job isn't paused nor cancelled, we can't be sure that it won't
> +     * issue more requets. We make an exception if we've reached this point

requests

> +     * from one of our own drain sections, to avoid a deadlock waiting for
> +     * ourselves.
> +     */
> +    if (!s->common.job.paused && !s->common.job.cancelled && !s->in_drain) {
> +        return true;
> +    }
> +
>      return !!s->in_flight;
>  }
>  
> 
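
To make the resulting truth table easier to check, here is a standalone
sketch of the condition in the hunk above. It assumes a flattened,
hypothetical struct toy_mirror in place of MirrorBlockJob (the real code
reaches paused and cancelled through s->common.job), so treat it as an
illustration of the logic rather than the actual implementation.

```c
#include <stdbool.h>

/* Hypothetical flattened view of the fields the hunk reads. */
struct toy_mirror {
    bool paused;      /* stands in for s->common.job.paused */
    bool cancelled;   /* stands in for s->common.job.cancelled */
    bool in_drain;    /* stands in for s->in_drain */
    int in_flight;    /* stands in for s->in_flight */
};

/* Returns true while the drain must keep polling (job not yet quiesced). */
static bool toy_mirror_drained_poll(const struct toy_mirror *s)
{
    /* A running job may still issue requests, unless we drained ourselves. */
    if (!s->paused && !s->cancelled && !s->in_drain) {
        return true;
    }
    /* Otherwise we are quiesced once no requests remain in flight. */
    return s->in_flight != 0;
}
```

A job that is neither paused nor cancelled keeps the poll returning true
even with zero in-flight requests, while the in_drain case falls through
to the in-flight check so the job does not wait on itself.
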

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


