From: Kevin Wolf
Subject: Re: [Qemu-block] [RFC v4 12/21] blockjobs: ensure abort is called for cancelled jobs
Date: Wed, 28 Feb 2018 17:05:18 +0100
User-agent: Mutt/1.9.1 (2017-09-22)

On 24.02.2018 at 00:51, John Snow wrote:
> Presently, even if a job is canceled post-completion as a result of
> a failing peer in a transaction, it will still call .commit because
> nothing has updated or changed its return code.
> 
> This does not cause problems currently because backup's
> implementation of .commit checks for cancellation itself.
> 
> I'd like to simplify this contract:
> 
> (1) Abort is called if the job/transaction fails
> (2) Commit is called if the job/transaction succeeds
> 
> To this end: A job's return code, if 0, will be forcibly set as
> -ECANCELED if that job has already concluded. Remove the now
> redundant check in the backup job implementation.

Thanks, another fix that addresses a comment I made on an earlier patch!
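
For readers following along, the simplified contract can be sketched in isolation. This is only a toy model under stated assumptions: `ToyJob`, `ToyJobDriver`, `toy_update_rc()`, `toy_completed_single()` and `toy_run()` are hypothetical stand-ins invented for illustration, not QEMU's types or functions, and the state transition and transaction handling from the real patch are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for BlockJob/BlockJobDriver; not QEMU's types. */
typedef struct ToyJob ToyJob;

typedef struct ToyJobDriver {
    void (*commit)(ToyJob *job);   /* called only on success */
    void (*abort)(ToyJob *job);    /* called only on failure/cancellation */
} ToyJobDriver;

struct ToyJob {
    const ToyJobDriver *driver;
    int ret;          /* 0 on success, negative errno on failure */
    bool cancelled;
    int commits;      /* counters so the outcome can be inspected */
    int aborts;
};

static void toy_commit(ToyJob *job) { job->commits++; }
static void toy_abort(ToyJob *job)  { job->aborts++; }

static const ToyJobDriver toy_driver = {
    .commit = toy_commit,
    .abort  = toy_abort,
};

/* Same idea as the patch's return-code fixup: a cancelled job
 * must never report success. */
static void toy_update_rc(ToyJob *job)
{
    if (!job->ret && job->cancelled) {
        job->ret = -ECANCELED;
    }
}

/* After the fixup, the return code alone selects the callback. */
static void toy_completed_single(ToyJob *job)
{
    toy_update_rc(job);

    if (!job->ret) {
        if (job->driver->commit) {
            job->driver->commit(job);
        }
    } else {
        if (job->driver->abort) {
            job->driver->abort(job);
        }
    }
}

/* Convenience helper: conclude a fresh job and return its final state. */
static ToyJob toy_run(int ret, bool cancelled)
{
    ToyJob job = { .driver = &toy_driver, .ret = ret, .cancelled = cancelled };
    toy_completed_single(&job);
    return job;
}
```

With this shape, a driver's commit callback no longer needs its own cancellation check: `toy_run(0, true)` ends with `ret == -ECANCELED` and only the abort callback invoked.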

> We need to check for cancellation in both block_job_completed
> AND block_job_completed_single, because jobs may be cancelled between
> those two calls; for instance in transactions.
> 
> The check in block_job_completed could be removed, but there's no
> point in starting to attempt to succeed a transaction that we know
> in advance will fail.
> 
> This does NOT affect mirror jobs that are "canceled" during their
> synchronous phase. The mirror job itself forcibly sets the canceled
> property to false prior to ceding control, so such cases will invoke
> the "commit" callback.

Which is a sign that mirror is abusing the interface. But yes, we need
to keep it working.

> Signed-off-by: John Snow <address@hidden>
> ---
>  block/backup.c     |  2 +-
>  block/trace-events |  1 +
>  blockjob.c         | 19 +++++++++++++++----
>  3 files changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 7e254dabff..453cd62c24 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -206,7 +206,7 @@ static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
>      BdrvDirtyBitmap *bm;
>      BlockDriverState *bs = blk_bs(job->common.blk);
>  
> -    if (ret < 0 || block_job_is_cancelled(&job->common)) {
> +    if (ret < 0) {
>          /* Merge the successor back into the parent, delete nothing. */
>          bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
>          assert(bm);
> diff --git a/block/trace-events b/block/trace-events
> index 266afd9e99..5e531e0310 100644
> --- a/block/trace-events
> +++ b/block/trace-events
> @@ -5,6 +5,7 @@ bdrv_open_common(void *bs, const char *filename, int flags, const char *format_n
>  bdrv_lock_medium(void *bs, bool locked) "bs %p locked %d"
>  
>  # blockjob.c
> +block_job_completed(void *job, int ret, int jret) "job %p ret %d corrected ret %d"
>  block_job_state_transition(void *job,  int ret, const char *legal, const char *s0, const char *s1) "job %p (ret: %d) attempting %s transition (%s-->%s)"
>  block_job_apply_verb(void *job, const char *state, const char *verb, const char *legal) "job %p in state %s; applying verb %s (%s)"
>  
> diff --git a/blockjob.c b/blockjob.c
> index 4d29391673..ef17dea004 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -384,13 +384,22 @@ void block_job_start(BlockJob *job)
>      bdrv_coroutine_enter(blk_bs(job->blk), job->co);
>  }
>  
> +static void block_job_update_rc(BlockJob *job)
> +{
> +    if (!job->ret && block_job_is_cancelled(job)) {
> +        job->ret = -ECANCELED;
> +    }
> +    if (job->ret) {
> +        block_job_state_transition(job, BLOCK_JOB_STATUS_ABORTING);
> +    }
> +}
> +
>  static void block_job_completed_single(BlockJob *job)
>  {
>      assert(job->completed);
>  
> -    if (job->ret || block_job_is_cancelled(job)) {
> -        block_job_state_transition(job, BLOCK_JOB_STATUS_ABORTING);
> -    }
> +    /* Ensure abort is called for late-transactional failures */
> +    block_job_update_rc(job);
>  
>      if (!job->ret) {
>          if (job->driver->commit) {
> @@ -898,7 +907,9 @@ void block_job_completed(BlockJob *job, int ret)
>      assert(blk_bs(job->blk)->job == job);
>      job->completed = true;
>      job->ret = ret;
> -    if (ret < 0 || block_job_is_cancelled(job)) {
> +    block_job_update_rc(job);

Hmm... We are transitioning to ABORTING here now. This means that in
block_job_completed_single() we get an ABORTING -> ABORTING transition.
A bit ugly, but not really a problem. If you decide to remove the
s0 == s1 special case in block_job_state_transition(), you'll need to
allow this transition in the table.
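
To make the point about the table concrete, here is a minimal sketch of a table-driven transition check with and without a same-state shortcut. Everything in it is assumed for illustration: the `ToyStatus` values, `toy_table` and `toy_transition_ok()` are hypothetical and much smaller than the real state machine in blockjob.c, and the `allow_self` flag merely models whether the s0 == s1 special case is present.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced set of states; not the real BlockJobStatus enum. */
typedef enum ToyStatus {
    TOY_STATUS_RUNNING,
    TOY_STATUS_ABORTING,
    TOY_STATUS_CONCLUDED,
    TOY_STATUS__MAX,
} ToyStatus;

/* toy_table[from][to] is true when the transition is legal.
 * Unlisted entries default to false, so ABORTING -> ABORTING is
 * not allowed by the table itself. */
static const bool toy_table[TOY_STATUS__MAX][TOY_STATUS__MAX] = {
    [TOY_STATUS_RUNNING] = {
        [TOY_STATUS_ABORTING]  = true,
        [TOY_STATUS_CONCLUDED] = true,
    },
    [TOY_STATUS_ABORTING] = {
        [TOY_STATUS_CONCLUDED] = true,
    },
};

/* allow_self models the s0 == s1 special case: when set, a transition
 * to the current state is accepted without consulting the table. */
static bool toy_transition_ok(ToyStatus s0, ToyStatus s1, bool allow_self)
{
    if (allow_self && s0 == s1) {
        return true;
    }
    return toy_table[s0][s1];
}
```

In this model, calling the return-code fixup twice is harmless only while the shortcut exists; once it is dropped, the repeated transition has to appear as an explicit `true` entry in the table.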

Reviewed-by: Kevin Wolf <address@hidden>


