qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/8] block/vdi: return ZERO block-status when appropriate


From: Eric Blake
Subject: Re: [PATCH 1/8] block/vdi: return ZERO block-status when appropriate
Date: Wed, 6 May 2020 08:57:20 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0

On 5/6/20 4:25 AM, Vladimir Sementsov-Ogievskiy wrote:
In case of !VDI_IS_ALLOCATED[], we do zero out the corresponding chunk
of qiov. So, this should be reported as ZERO.

This part makes sense outright, since vdi does not allow backing files, so all reads of a VDI disk result in data rather than deferral to another layer, and we indeed read zero in this case. However, it _is_ a behavior change: previously, 'qemu-io -c map' on a vdi image will show unallocated portions, but with this change, the entire image now shows as allocated. You need to call out this fact in the commit message, documenting that it is intentional (it matches what we do for posix files: since neither files nor vdi support backing images, we guarantee that all read attempts will be satisfied by this layer rather than deferring to a backing layer, and thus can treat all data as allocated in this layer, regardless of whether it is sparse).

Do any existing iotests need an update to reflect change in output? If not, should we have an iotest covering it?


After block-status update, it never reports 0, so setting
unallocated_blocks_are_zero doesn't make sense. Drop it.

This is a harder claim. To prove it, we need to inspect all callers that use unallocated_blocks_are_zero, to see their intent. Fortunately, the list is small - a mere 2 clients!

block/io.c:bdrv_co_block_status(): if .bdrv_co_block_status sets either _DATA or _ZERO, block status adds _ALLOCATED; but if the driver did not set either, we use bdrv_unallocated_blocks_are_zero() in order to set _ZERO but not _ALLOCATED. This is the code that explains the behavior change mentioned above (now that vdi is reporting _ZERO instead of 0, block_status is appending _ALLOCATED instead of querying bdrv_unallocated_blocks_are_zero). But you are correct that it does not change where _ZERO is reported.

qemu-img.c:img_convert(): calls bdrv_get_info() up front and caches what it learned from unallocated_blocks_are_zero about the target; later on, in convert_iteration_sectors(), it uses this information to optimize the case where the source reads as zero, but the target has a backing file and the range being written lies beyond the end of the backing file. That is, given the following scenario:

qemu-img convert input -B backing output
input:   ABCD-0E
backing: ACBD

the optimization lets us produce:
output:  -BC---E

instead of the larger:
output:  -BC--0E

Do we have any iotests that cover this particular scenario, to prove that our optimization does indeed result in a smaller image file without explicit zeros written in the tail? Or put another way, if we were to disable the post_backing_zero optimization in convert_iteration_sectors(), would any iotests fail? If not, we should really enhance our testsuite.

With that said, we could just as easily achieve the optimization of the conversion to the tail of a destination with short backing file by checking block_status to see whether the tail already reads as all zeroes, rather than relying on unallocated_blocks_are_zero. Even if querying block_status is a slight pessimization in time, it would avoid regressing in the more important factor of artificially bloating the destination. I would _really_ love to see a patch to qemu-img.c reworking the logic to avoid the use of unallocated_blocks_are_zero first, prior to removing the setting of that field in the various drivers. Once bdrv_co_block_status() remains as the only client of the field, then I could accept this patch with a better commit message about the intentional change in block_status _ALLOCATION behavior.


Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
---
  block/vdi.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 0c7835ae70..83471528d2 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -334,7 +334,6 @@ static int vdi_get_info(BlockDriverState *bs, 
BlockDriverInfo *bdi)
      logout("\n");
      bdi->cluster_size = s->block_size;
      bdi->vm_state_offset = 0;
-    bdi->unallocated_blocks_are_zero = true;
      return 0;
  }
@@ -536,7 +535,7 @@ static int coroutine_fn vdi_co_block_status(BlockDriverState *bs,
      *pnum = MIN(s->block_size - index_in_block, bytes);
      result = VDI_IS_ALLOCATED(bmap_entry);
      if (!result) {
-        return 0;
+        return BDRV_BLOCK_ZERO;
      }
*map = s->header.offset_data + (uint64_t)bmap_entry * s->block_size +


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




reply via email to

[Prev in Thread] Current Thread [Next in Thread]