qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v1] [RFC] qcow2: add compression type feature


From: Denis Plotnikov
Subject: Re: [Qemu-devel] [PATCH v1] [RFC] qcow2: add compression type feature
Date: Fri, 17 May 2019 09:04:13 +0000


On 16.05.2019 17:42, Eric Blake wrote:
> On 5/16/19 8:48 AM, Denis Plotnikov wrote:
>> The patch adds some preparation parts for incompatible compression type
>> feature into QCOW2 header that indicates that *all* compressed clusters
>> must be (de)compressed using a certain compression type.
>>
>> It is implied that the compression type is set on the image creation and
>> can be changed only later by image conversion, thus the only compression
>> algorithm is used for the image.
>>
>> The plan is to add support for ZSTD and then may be something more effective
>> in the future.
>>
>> ZSTD compression algorithm consumes 3-5 times less CPU power with a
>> comparable compression ratio with zlib. It would be wise to use it for
>> data compression e.g. for backups.
>>
>> The default compression is ZLIB.
>>
>> Signed-off-by: Denis Plotnikov <address@hidden>
>> ---
> 
>> +++ b/docs/interop/qcow2.txt
>> @@ -109,7 +109,11 @@ in the description of a field.
>>                                   An External Data File Name header 
>> extension may
>>                                   be present if this bit is set.
>>   
>> -                    Bits 3-63:  Reserved (set to 0)
>> +                    Bit 3:      Compression type bit. If the bit is set, 
>> then the
>> +                                type of compression the image uses is set 
>> in the
>> +                                header extension
> 
> I'd call out 'Compression type' header extension by name, to make it
> more obvious.  Is it an error if bit 3 is set but the compression header
> is not present? Is it an error if the compression header is present but
> bit 3 is not set?
yes to both, the bit can't exist without the header and vise versa. This 
also implies that there is no use in the bit set and compression type = 
ZLIB. This is ensure that older qemu(s) can work with the images that 
use ZLIB without problems. Is there any drawbacks in this approach?
> 
>> +
>> +                    Bits 4-63:  Reserved (set to 0)
>>   
>>            80 -  87:  compatible_features
>>                       Bitmask of compatible features. An implementation can
>> @@ -175,6 +179,7 @@ be stored. Each extension has a structure like the 
>> following:
>>                           0x23852875 - Bitmaps extension
>>                           0x0537be77 - Full disk encryption header pointer
>>                           0x44415441 - External data file name string
>> +                        0x434D5052 - Compression type extension
> 
> Our earlier magic numbers were probably created as random numbers and
> contain 8-bit values, to make them less likely to appear naturally in
> other parts of the file and thus less likely to be misinterpreted.  But
> that's not a requirement, and I see that you followed the lead of "DATA"
> and created "CMPR" for yours.  Works for me :)
> 
>>                           other      - Unknown header extension, can be 
>> safely
>>                                        ignored
>>   
>> @@ -771,3 +776,21 @@ In the image file the 'enabled' state is reflected by 
>> the 'auto' flag. If this
>>   flag is set, the software must consider the bitmap as 'enabled' and start
>>   tracking virtual disk changes to this bitmap from the first write to the
>>   virtual disk. If this flag is not set then the bitmap is disabled.
>> +
>> +
>> +== Compression type extension ==
>> +
>> +The compression type extension is an optional header extension. It stores 
>> the
> 
> Could probably do a better job at describing when the header is optional
> vs. mandatory.
> 
>> +ID of the compressor which has to be used to compress/decompress disk 
>> clusters.
>> +The compression type is used for all disk cluster. Two clusters of the image
>> +couldn't be compressed with different compressors.
> 
> Wording suggestion: A single compression type is applied to all
> compressed disk clusters, with no way to change compression types per
> cluster.
> 
Nice one!
> But is that a hard requirement? Since this is already an incompatible
> feature extension, we could have a compression type that states that
> each compressed cluster is self-describing via a 1-byte prefix (yes, it
> means compression is not quite as dense, but probably not an issue).
> 
> Something like: in the image header, we have compression type 1 = zlib,
> compression type 2 = zstd, etc, each of which treat all compressed
> clusters as-is with no further per-cluster headers. Or, in the image
> header, we have compression type 255 = per-cluster, at which point a
> compressed cluster is now represented as: [1-byte prefix] [tail], where
> the one-byte prefix is 1 = zlib, 2 = zstd, etc (but not 255), and then
> the tail is decoded with the appropriate algorithm. In this way, it
> might even be possible to encode different clusters with an optimal
> algorithm per cluster, and thus create an image that requires both zlib
> and zstd to be fully read.
> 
> I'm not sure if we need that much complexity, but just throwing it out
> there for thought.
Yes, I thought about per-cluster compression type as well but I have a 
few concerns about that.
1. So far can't come up with approach to defining the best algorithm to 
the specific chunk of data in advance before actually compressing it 
with different compressors and comparing the sizes afterwards 
(compression speed is also important).
May be it's better to give users ability to decide on their own i.e. to 
choose a policy on disk creation:
the fastest compression, minimum image on disk size or balanced.

2. The image still should be converted to use in older qemu-s. Can't get 
rid of the conversion
> 
>> +
>> +The compression type can be set on the image creation. The only way to 
>> change
>> +the compression type is to convert the image explicitly.
>> +
>> +Available compression types:
>> +    ID    0: ZLIB (gzip)
>> +          1: ZSTD
>> +
>> +The default compression type is ZLIB. When ZLIB is used the compression type
>> +header extension is not present.
> 
> Here's where we have to think about back-compat. If zlib is used, and
> the compression type header is present, must incompatible bit 3 be set?
> Do we want to permit images that have incompatible bit 3 set and zlib
> explicitly mentioned? 
No
> Or are you making a hard requirement that if zlib
> is chosen, incompatible bit 3 must be absent and no compression header
> should be set?
Yes
> Or is it okay for the compression header to be present
> and incompatible bit 3 clear, but only when compression type 0 is
> chosen?
No
> Let's spell out exactly what we want, probably with a goal of
> minimizing the number of situations where an incompatible bit must be
> set (as that makes it harder to work with images in older software).
> 
Ok
> Does the compression type really have to be chosen at image creation, or
> can the decision be deferred until the time that the first compressed
> cluster is written?  
Yes, it could be that way (even better). If the compression type is 
changed in run-time and it's no ZLIB the extension header is written and 
the incompatible bit is set...
> You could implement things to state that if
> incompatible bit 3 is set but the compression header is absent, then
> there must not be any compressed clusters in the image; as soon as the
> first compressed cluster is written, then the compression header must
> also be written (even if it explicitly calls out zlib), to make it
> easier for new software to tell at a glance if the image has ever
> contained compressed clusters at least once in the past.
... yeah, when the first cluster is written both the bit and the header 
is written if not ZLIB but need a flag whether we have at least one 
cluster compressed
> 
>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> index 7ccbfff9d0..8eebcc728b 100644
>> --- a/qapi/block-core.json
>> +++ b/qapi/block-core.json
>> @@ -698,6 +698,7 @@
>>   { 'struct': 'BlockMeasureInfo',
>>     'data': {'required': 'int', 'fully-allocated': 'int'} }
>>   
>> +
>>   ##
> 
> Why the added blank line?
> 
>>   # @query-block:
>>   #
>> @@ -5257,3 +5258,16 @@
>>     'data' : { 'node-name': 'str',
>>                'iothread': 'StrOrNull',
>>                '*force': 'bool' } }
>> +
>> +##
>> +# @Qcow2CompressionType:
>> +#
>> +# Compression type used in qcow2 image file
>> +#
>> +# @zlib - gzip compressor
>> +# @zstd - zstd compression
>> +#
>> +# Since: 4.0
> 
> You've missed 4.0; this should be 4.1.
Will go with 4.1

Thanks!
Denis
> 
>> +##
>> +{ 'enum': 'Qcow2CompressionType',
>> +  'data': [ 'zlib', 'zstd' ] }
>>
> 

-- 
Best,
Denis

reply via email to

[Prev in Thread] Current Thread [Next in Thread]