Re: [PATCH 00/17] Improve qcow2 all-zero detection


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH 00/17] Improve qcow2 all-zero detection
Date: Wed, 5 Feb 2020 17:58:57 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1

05.02.2020 17:43, Vladimir Sementsov-Ogievskiy wrote:
05.02.2020 17:22, Eric Blake wrote:
On 2/5/20 3:04 AM, Vladimir Sementsov-Ogievskiy wrote:

[repo.or.cz appears to be down as I type this; I'll post a link to a
repository later when it comes back up]

Now up
https://repo.or.cz/qemu/ericb.git/shortlog/refs/tags/qcow2-all-zero-v1



I have several ideas around it.

1. For generic block layer.
Did you consider, as an alternative to BDRV_ZERO_OPEN, exporting the
information through normal block_status? So, if we have the
information that the disk is all-zero, we can always add the _ZERO
flag to the block-status result.

Makes sense.

And in a generic bdrv_is_all_zeroes(), we can just call
block_status(0, disk_size), which will return ZERO and n=disk_size if
the driver supports the all-zero feature and the disk is all-zero now.
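
For illustration, a minimal sketch of what such a generic helper could
look like, assuming the existing bdrv_block_status()/BDRV_BLOCK_ZERO
interface; the helper name is invented here and is not from the series:

static bool bdrv_is_all_zeroes_oneshot(BlockDriverState *bs)
{
    int64_t size = bdrv_getlength(bs);
    int64_t pnum = 0;
    int ret;

    if (size < 0) {
        return false;
    }
    /* One query over the whole image; trust it only if the driver
     * answers with a single ZERO extent covering everything. */
    ret = bdrv_block_status(bs, 0, size, &pnum, NULL, NULL);
    return ret >= 0 && (ret & BDRV_BLOCK_ZERO) && pnum == size;
}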

Less obvious.  block_status is not required to visit the entire disk, even if 
the entire disk is all zero.  For example, qcow2 visits at most one L2 page in 
a call (if the request spans L1 entries, it will be truncated at the boundary, 
even if the region before and after the boundary have the same status).  I'm 
also worried that we still have 32-bit limitations in block_status (ideally,
we've fixed things to support 64-bit status where possible, but I'm not 
counting on it).

Not required, but why not do it? If we have the information that the whole
disk has the same ZERO status, there is no reason to reply to
block_status(0, disk_size) with a smaller n.


I think block-status is the natural way to convey such information, and I
think that we anyway want to get to 64-bit block-status support
for qcow2 and NBD.

Block status requires an O(n) loop over the disk, where n is the number of 
distinct extents possible.  If you get lucky, and block_status(0,size) returns 
a single extent, then yes that can feed the 'is_zeroes' request.  Similarly, a 
single return of non-zero data can instantly tell you that 'is_zeroes' is 
false.  But given that drivers may break up their response on convenient
boundaries, such as qcow2 at L1 entry granularity, you cannot blindly assume
that a return of zero data for less than the requested size implies non-zero
data; it only means there is insufficient information to tell whether the disk
is all_zeroes without further block_status calls, and that's where you lose
out on speed compared to just being told up-front by an 'is_zero' call.
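
As a rough sketch of the O(n) loop described above (again using today's
bdrv_block_status() and an invented helper name), this is what proving
"all zeroes" looks like when the driver is allowed to truncate its
answers:

static bool bdrv_is_all_zeroes_scan(BlockDriverState *bs)
{
    int64_t size = bdrv_getlength(bs);
    int64_t offset = 0;

    if (size < 0) {
        return false;
    }
    while (offset < size) {
        int64_t pnum = 0;
        int ret = bdrv_block_status(bs, offset, size - offset,
                                    &pnum, NULL, NULL);
        if (ret < 0 || !(ret & BDRV_BLOCK_ZERO)) {
            return false;   /* error, or definitely not all zeroes */
        }
        /* The driver may legitimately answer for less than was asked
         * (e.g. qcow2 stops at an L2 table boundary), so keep going. */
        offset += pnum;
    }
    return true;
}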

Yes. But how is it worse than BDRV_ZERO_OPEN? With one block_status call we
have the same information. If, on block_status(0, disk_size), the driver
replies with ZERO but for less than disk_size, it means that either the disk
is not all-zero, or the driver doesn't support a 'fast whole-disk zero check'
feature, which is equivalent to not supporting BDRV_ZERO_OPEN.



2. For NBD
Again, a possible alternative is BLOCK_STATUS, but we need 64-bit
commands for it. I plan to send a proposal anyway. Still, there is
nothing bad in having two possible paths for receiving the all-zero
information. And even with your NBD extension, we can export this
information through block-status [1.]

Yes, having 64-bit BLOCK_STATUS in NBD is orthogonal to this, but both ideas 
are independently useful, and as the level of difficulty in implementing things 
may vary, it is conceivable to have both a server that provides 'is_zero' but 
not BLOCK_STATUS, and a server that provides 64-bit BLOCK_STATUS but not 
'is_zero'.


3. For qcow2
Hmm. Here, as I understand it, the main case is a freshly created qcow2,
which is fully unallocated. To see that it is empty, we only need
to check all L1 entries, and for an empty L1 table that is fast.
So we don't need any qcow2 format improvement to check it.
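
For concreteness, a rough sketch of that L1 scan, assuming qcow2's
in-memory L1 table (BDRVQcow2State and L1E_OFFSET_MASK from block/qcow2.h)
and ignoring backing files and external data files, for which
"unallocated" would not mean "zero":

static bool qcow2_is_fully_unallocated(BlockDriverState *bs)
{
    BDRVQcow2State *s = bs->opaque;
    int i;

    for (i = 0; i < s->l1_size; i++) {
        if (s->l1_table[i] & L1E_OFFSET_MASK) {
            return false;   /* at least one L2 table is allocated */
        }
    }
    return true;            /* no clusters allocated => reads as zeroes */
}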

The benefit of this patch series is that it detects preallocated qcow2 images 
as all_zero.  What's more, scanning all L1 entries is O(n), but detecting an 
autoclear all_zero bit is O(1).  Your proposed L1 scan is accurate for fewer 
cases, and costs more time.
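
To make the O(1) contrast concrete: with an autoclear bit the check
degenerates to a flag test on state already loaded at open time. The
bit name and value below are hypothetical, not taken from the series:

/* assumes BDRVQcow2State from block/qcow2.h; the bit itself is made up */
#define QCOW2_AUTOCLEAR_ALL_ZERO_EXAMPLE (1ULL << 2)

static bool qcow2_known_all_zeroes(BlockDriverState *bs)
{
    BDRVQcow2State *s = bs->opaque;

    /* Any writer that does not know about the bit clears it, so if it
     * is still set, the image is known to read as all zeroes. */
    return s->autoclear_features & QCOW2_AUTOCLEAR_ALL_ZERO_EXAMPLE;
}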

Ah yes, somehow I thought that the L1 table is not allocated for a fresh
image.

Hmm, then possibly we need two new top-level flags: "all-zero" and
"all-unallocated".


That makes sense only with incompatible-feature semantics. With autoclear
semantics it's better to have one 'all-allocated-are-zero' flag and not care.


--
Best regards,
Vladimir


