qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 05/12] mirror: perform COW if the cluster siz


From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH v2 05/12] mirror: perform COW if the cluster size is bigger than the granularity
Date: Fri, 18 Jan 2013 16:13:14 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120605 Thunderbird/13.0

Am 16.01.2013 18:31, schrieb Paolo Bonzini:
> When mirroring runs, the backing files for the target may not yet be
> ready.  However, this means that a copy-on-write operation on the target
> would fill the missing sectors with zeros.  Copy-on-write only happens
> if the granularity of the dirty bitmap is smaller than the cluster size
> (and only for clusters that are allocated in the source after the job
> has started copying).  So far, the granularity was fixed to 1MB; to avoid
> the problem we detected the situation and required the backing files to
> be available in that case only.
> 
> However, we want to lower the granularity for efficiency, so we need
> a better solution.  The solution is to always copy a whole cluster the
> first time it is touched.  The code keeps a bitmap of clusters that
> have already been allocated by the mirroring job, and only does "manual"
> copy-on-write if the chunk being copied is zero in the bitmap.
> 
> Signed-off-by: Paolo Bonzini <address@hidden>
> ---
>         v1->v2: rebased for moved include files
> 
>  block/mirror.c             |   60 +++++++++++++++++++++++++++++++++++++------
>  blockdev.c                 |   15 ++---------
>  tests/qemu-iotests/041     |   21 +++++++++++++++
>  tests/qemu-iotests/041.out |    4 +-
>  trace-events               |    1 +
>  5 files changed, 78 insertions(+), 23 deletions(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index 20cb1e7..ee45e2e 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -15,6 +15,7 @@
>  #include "block/blockjob.h"
>  #include "block/block_int.h"
>  #include "qemu/ratelimit.h"
> +#include "qemu/bitmap.h"
>  
>  enum {
>      /*
> @@ -36,6 +37,8 @@ typedef struct MirrorBlockJob {
>      bool synced;
>      bool should_complete;
>      int64_t sector_num;
> +    size_t buf_size;
> +    unsigned long *cow_bitmap;
>      HBitmapIter hbi;
>      uint8_t *buf;
>  } MirrorBlockJob;
> @@ -60,7 +63,7 @@ static int coroutine_fn mirror_iteration(MirrorBlockJob *s,
>      BlockDriverState *target = s->target;
>      QEMUIOVector qiov;
>      int ret, nb_sectors;
> -    int64_t end;
> +    int64_t end, sector_num, cluster_num;
>      struct iovec iov;
>  
>      s->sector_num = hbitmap_iter_next(&s->hbi);
> @@ -71,22 +74,41 @@ static int coroutine_fn mirror_iteration(MirrorBlockJob 
> *s,
>          assert(s->sector_num >= 0);
>      }
>  
> +    /* If we have no backing file yet in the destination, and the cluster 
> size
> +     * is very large, we need to do COW ourselves.  The first time a cluster 
> is
> +     * copied, copy it entirely.
> +     *
> +     * Because both BDRV_SECTORS_PER_DIRTY_CHUNK and the cluster size are
> +     * powers of two, the number of sectors to copy cannot exceed one 
> cluster.
> +     */
> +    sector_num = s->sector_num;
> +    nb_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
> +    cluster_num = sector_num / BDRV_SECTORS_PER_DIRTY_CHUNK;
> +    if (s->cow_bitmap && !test_bit(cluster_num, s->cow_bitmap)) {
> +        trace_mirror_cow(s, sector_num);
> +        bdrv_round_to_clusters(s->target,
> +                               sector_num, BDRV_SECTORS_PER_DIRTY_CHUNK,
> +                               &sector_num, &nb_sectors);
> +        bitmap_set(s->cow_bitmap, sector_num / BDRV_SECTORS_PER_DIRTY_CHUNK,
> +                   nb_sectors / BDRV_SECTORS_PER_DIRTY_CHUNK);

Here the bit in the cow_bitmap is set before the COW has actually been
performed. It could still fail.

> +    }
> +
>      end = s->common.len >> BDRV_SECTOR_BITS;
> -    nb_sectors = MIN(BDRV_SECTORS_PER_DIRTY_CHUNK, end - s->sector_num);
> -    bdrv_reset_dirty(source, s->sector_num, nb_sectors);
> +    nb_sectors = MIN(nb_sectors, end - sector_num);
> +    bdrv_reset_dirty(source, sector_num, nb_sectors);
>  
>      /* Copy the dirty cluster.  */
>      iov.iov_base = s->buf;
>      iov.iov_len  = nb_sectors * 512;
>      qemu_iovec_init_external(&qiov, &iov, 1);
>  
> -    trace_mirror_one_iteration(s, s->sector_num, nb_sectors);
> -    ret = bdrv_co_readv(source, s->sector_num, nb_sectors, &qiov);
> +    trace_mirror_one_iteration(s, sector_num, nb_sectors);
> +    ret = bdrv_co_readv(source, sector_num, nb_sectors, &qiov);
>      if (ret < 0) {
>          *p_action = mirror_error_action(s, true, -ret);
>          goto fail;
>      }
> -    ret = bdrv_co_writev(target, s->sector_num, nb_sectors, &qiov);
> +    ret = bdrv_co_writev(target, sector_num, nb_sectors, &qiov);
>      if (ret < 0) {
>          *p_action = mirror_error_action(s, false, -ret);
>          s->synced = false;
> @@ -96,7 +118,7 @@ static int coroutine_fn mirror_iteration(MirrorBlockJob *s,
>  
>  fail:
>      /* Try again later.  */
> -    bdrv_set_dirty(source, s->sector_num, nb_sectors);
> +    bdrv_set_dirty(source, sector_num, nb_sectors);

If it does, we mark the whole cluster dirty now, but in the cow_bitmap
it's still marked at present on the target. When restarting the job,
wouldn't it copy only the start of the cluster next time and corrupt the
rest of it?

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]