qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v9 13/13] block/backup: use backup-top instead o


From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH v9 13/13] block/backup: use backup-top instead of write notifiers
Date: Wed, 28 Aug 2019 21:50:51 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
> Drop write notifiers and use filter node instead.
> 
> = Changes =
> 
> 1. add filter-node-name argument for backup qmp api. We have to do it
> in this commit, as 257 needs to be fixed.

I feel a bit bad about it not being an implicit node.  But I know you
don’t like them, they’re probably just broken altogether and because
libvirt doesn’t use backup (yet), probably nobody cares.

> 2. there no move write notifiers here, so is_write_notifier parameter

s/there/there are/, I suppose?

> is dropped from block-copy paths.
> 
> 3. Intersecting requests handling changed, now synchronization between
> backup-top, backup and guest writes are all done in block/block-copy.c
> and works as follows:
> 
> On copy operation, we work only with dirty areas. If bits are dirty it
> means that there are no requests intersecting with this area. We clear
> dirty bits and take bdrv range lock (bdrv_co_try_lock) on this area to
> prevent further operations from interaction with guest (only with
> guest, as neither backup nor backup-top will touch non-dirty area). If
> copy-operation failed we set dirty bits back together with releasing
> the lock.
> 
> The actual difference with old scheme is that on guest writes we
> don't lock the whole region but only dirty-parts, and to be more
> precise: only dirty-part we are currently operate on. In old scheme
> guest write to non-dirty area (which may be safely ignored by backup)
> may wait for intersecting request, touching some other area which is
> dirty.
> 
> 4. To sync with in-flight requests at job finish we now have drained
> removing of the filter, we don't need rw-lock.
> 
> = Notes =
> 
> Note the consequence of three objects appearing: backup-top, backup job
> and block-copy-state:
> 
> 1. We want to insert backup-top before job creation, to behave similar
> with mirror and commit, where job is started upon filter.
> 
> 2. We also have to create block-copy-state after filter injection, as
> we don't want it's source child be replaced by fitler. Instead we want

s/it's/its/, s/filter/filter/ (although “fitler” does have an amusing
ring to it)

> to keep BCS.source to be real source node, as we want to use
> bdrv_co_try_lock in CBW operations and it can't be used on filter, as
> on filter we already have in-flight (write) request from upper layer.

Reasonable, even more so as I suppose BCS.source should eventually be a
BdrvChild of backup-top.

What looks wrong is that the sync_bitmap is created on the source (“bs”
in backup_job_create()), but backup_cleanup_sync_bitmap() still assumes
it is on blk_bs(job->common.blk) (which is no longer true).

> So, we firstly create inject backup-top, then create job and BCS. BCS
> is the latest just to not create extra variable for it. Finally we set
> bcs for backup-top filter.
> 
> = Iotest changes =
> 
> 56: op-blocker doesn't shot now, as we set it on source, but then check

s/shot/show/?

> on filter, when trying to start second backup, so error caught in
> test_dismiss_collision is changed. It's OK anyway, as this test-case
> seems to test that after some collision we can dismiss first job and
> successfully start the second one. So, the change is that collision is
> changed from op-blocker to file-posix locks.

Well, but the op blocker belongs to the source, which should be covered
by internal permissions.  The fact that it now shows up as a file-posix
error thus shows that the conflict also moves from the source to the
target.  It’s OK because we actually don’t have a conflict on the source.

But I wonder whether it would be better for test_dismiss_collision() to
do a blockdev-backup instead where we can see the collision on the target.

Hm.  On second thought, why do we even get a conflict on the target?
block-copy does share the WRITE permission for it...

> However, it's obvious now that we'd better drop this op-blocker at all
> and add a test-case for two backups from one node (to different
> destinations) actually works. But not in these series.
> 
> 257: The test wants to emulate guest write during backup. They should
> go to filter node, not to original source node, of course. Therefore we
> need to specify filter node name and use it.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
> ---
>  qapi/block-core.json       |   8 +-
>  include/block/block-copy.h |   2 +-
>  include/block/block_int.h  |   1 +
>  block/backup-top.c         |  14 +-
>  block/backup.c             | 113 +++-----------
>  block/block-copy.c         |  55 ++++---
>  block/replication.c        |   2 +-
>  blockdev.c                 |   1 +
>  tests/qemu-iotests/056     |   2 +-
>  tests/qemu-iotests/257     |   7 +-
>  tests/qemu-iotests/257.out | 306 ++++++++++++++++++-------------------
>  11 files changed, 230 insertions(+), 281 deletions(-)

Nice.

I don’t have much to object (for some reason, I feel a bit uneasy about
dropping all the request waiting code, but I suppose that is indeed
taken care of with the range locks now).

[...]

> diff --git a/block/backup.c b/block/backup.c
> index d927c63e5a..259a165405 100644
> --- a/block/backup.c
> +++ b/block/backup.c

[...]

> @@ -552,6 +480,9 @@ BlockJob *backup_job_create(const char *job_id, 
> BlockDriverState *bs,
>          backup_clean(&job->common.job);
>          job_early_fail(&job->common.job);
>      }
> +    if (backup_top) {
> +        bdrv_backup_top_drop(backup_top);
> +    }

This order is weird because backup-top still has a pointer to the BCS,
but we have already freed it at this point.

>  
>      return NULL;
>  }
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 6828c46ba0..f3102ec3ff 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -98,27 +98,32 @@ fail:
>   * error occurred, return a negative error number
>   */
>  static int coroutine_fn block_copy_with_bounce_buffer(
> -        BlockCopyState *s, int64_t start, int64_t end, bool 
> is_write_notifier,
> +        BlockCopyState *s, int64_t start, int64_t end,
>          bool *error_is_read, void **bounce_buffer)
>  {
>      int ret;
> -    int nbytes;
> -    int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0;
> +    int nbytes = MIN(s->cluster_size, s->len - start);
> +    void *lock = bdrv_co_try_lock(blk_bs(s->source), start, nbytes);
> +
> +    /*
> +     * Function must be called only on full-dirty region, so nobody touching 
> or
> +     * touched these bytes. Therefore, we must successfully get lock.

Which is OK because backup-top does not share the WRITE permission.  But
it is organized a bit weirdly, because it’s actually block-copy here
that relies on the WRITE permission not to be shared; which it itself
cannot do because backup-top needs it.

This of course results from the fact that backup-top and block-copy
should really use the same object to access the source, namely the
backing BdrvChild of backup-top.

Maybe there should be a comment somewhere that points this out?

> +     */
> +    assert(lock);

[...]

> @@ -128,13 +133,16 @@ static int coroutine_fn block_copy_with_bounce_buffer(
>          if (error_is_read) {
>              *error_is_read = false;
>          }
> -        goto fail;
> +        goto out;
>      }
>  
> -    return nbytes;
> -fail:
> -    bdrv_set_dirty_bitmap(s->copy_bitmap, start, s->cluster_size);
> -    return ret;
> +out:
> +    if (ret < 0) {
> +        bdrv_set_dirty_bitmap(s->copy_bitmap, start, s->cluster_size);
> +    }
> +    bdrv_co_unlock(lock);
> +
> +    return ret < 0 ? ret : nbytes;
>  

I feel like it would still be simpler to keep the separate fail path and
just duplicate the bdrv_co_unlock(lock) call.

[...]

> diff --git a/blockdev.c b/blockdev.c
> index fbef6845c8..f89e48fc79 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -3601,6 +3601,7 @@ static BlockJob *do_backup_common(BackupCommon *backup,
>      job = backup_job_create(backup->job_id, bs, target_bs, backup->speed,
>                              backup->sync, bmap, backup->bitmap_mode,
>                              backup->compress,
> +                            backup->filter_node_name,

Is this guaranteed to be NULL when not specified by the user?

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]