qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH 1/5] blockjob: do not allow coroutine double ent


From: Stefan Hajnoczi
Subject: Re: [Qemu-block] [PATCH 1/5] blockjob: do not allow coroutine double entry or entry-after-completion
Date: Mon, 20 Nov 2017 11:16:53 +0000
User-agent: Mutt/1.9.1 (2017-09-22)

On Sun, Nov 19, 2017 at 09:46:42PM -0500, Jeff Cody wrote:
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -291,10 +291,10 @@ void block_job_start(BlockJob *job)
>  {
>      assert(job && !block_job_started(job) && job->paused &&
>             job->driver && job->driver->start);
> -    job->co = qemu_coroutine_create(block_job_co_entry, job);
>      job->pause_count--;
>      job->busy = true;
>      job->paused = false;
> +    job->co = qemu_coroutine_create(block_job_co_entry, job);
>      bdrv_coroutine_enter(blk_bs(job->blk), job->co);
>  }
>  

This hunk makes no difference.  The coroutine is only entered by
bdrv_coroutine_enter() so the order of job field initialization doesn't
matter.

> @@ -797,11 +797,14 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType 
> type, int64_t ns)
>          return;
>      }
>  
> -    job->busy = false;
> +    /* We need to leave job->busy set here, because when we have
> +     * put a coroutine to 'sleep', we have scheduled it to run in
> +     * the future.  We cannot enter that same coroutine again before
> +     * it wakes and runs, otherwise we risk double-entry or entry after
> +     * completion. */
>      if (!block_job_should_pause(job)) {
>          co_aio_sleep_ns(blk_get_aio_context(job->blk), type, ns);
>      }
> -    job->busy = true;
>  
>      block_job_pause_point(job);

This leaves a stale doc comment in include/block/blockjob_int.h:

  /**
   * block_job_sleep_ns:
   * @job: The job that calls the function.
   * @clock: The clock to sleep on.
   * @ns: How many nanoseconds to stop for.
   *
   * Put the job to sleep (assuming that it wasn't canceled) for @ns
   * nanoseconds.  Canceling the job will interrupt the wait immediately.
                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   */
  void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns);

This raises questions about the ability to cancel sleep:

1. Does something depend on cancelling sleep?

2. Did cancellation work properly in commit
   4513eafe928ff47486f4167c28d364c72b5ff7e3 ("block: add
   block_job_sleep_ns") and was it broken afterwards?

It is possible to fix the recursive coroutine entry without losing sleep
cancellation.  Whether it's worth the trouble depends on the answers to
the above questions.

Stefan

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]