qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmap


From: John Snow
Subject: Re: [Qemu-block] [Qemu-devel] [PATCH 03/17] spec: add qcow2-dirty-bitmaps specification
Date: Mon, 5 Oct 2015 20:09:59 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0


On 09/15/2015 12:24 PM, Eric Blake wrote:
> On 09/05/2015 10:43 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Persistent dirty bitmaps will be saved into qcow2 files. It may be used
>> as 'internal' bitmaps (for qcow2 drives) or as 'external' bitmaps for
>> other drives (there may be qcow2 file with zero disk size but with
>> several dirty bitmaps for other drives).
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
>> ---
>>  docs/specs/qcow2.txt | 127 
>> ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 126 insertions(+), 1 deletion(-)
>>
>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>> index 121dfc8..5fc0365 100644
>> --- a/docs/specs/qcow2.txt
>> +++ b/docs/specs/qcow2.txt
>> @@ -103,7 +103,13 @@ in the description of a field.
>>                      write to an image with unknown auto-clear features if it
>>                      clears the respective bits from this field first.
>>  
>> -                    Bits 0-63:  Reserved (set to 0)
>> +                    Bit 0:      Dirty bitmaps bit. If this bit is set then
>> +                                there is a _consistent_ Dirty bitmaps 
>> extension
>> +                                in the image. If it is not set, but there 
>> is a
>> +                                Dirty bitmaps extension, its data should be
>> +                                considered as inconsistent.
> 
> Thanks for documenting this. I don't know that we use underscore for
> _emphasis_ anywhere else in the file, but I don't have any better
> suggestions.  Should you also require that it is an error if this bit is
> set but no Dirty bitmap extension header is present?
> 

An error, but one that can be safely corrected by any fsck-style
utility: clear the bit.

>> +
>> +                    Bits 1-63:  Reserved (set to 0)
>>  
>>           96 -  99:  refcount_order
>>                      Describes the width of a reference count block entry 
>> (width
>> @@ -123,6 +129,7 @@ be stored. Each extension has a structure like the 
>> following:
>>                          0x00000000 - End of the header extension area
>>                          0xE2792ACA - Backing file format name
>>                          0x6803f857 - Feature name table
>> +                        0x23852875 - Dirty bitmaps
>>                          other      - Unknown header extension, can be safely
>>                                       ignored
>>  
>> @@ -166,6 +173,24 @@ the header extension data. Each entry look like this:
>>                      terminated if it has full length)
>>  
>>  
>> +== Dirty bitmaps ==
>> +
>> +Dirty bitmaps is an optional header extension. It provides an ability to 
>> store
>> +dirty bitmaps in a qcow2 image. The fields are:
> 
> Might not hurt to remind the reader about the auto-clear feature bit
> mentioned earlier controlling whether this extension can be trusted as
> consistent.
> 
>> +
>> +          0 -  3:  nb_dirty_bitmaps
>> +                   The number of dirty bitmaps contained in the image. Valid
>> +                   values: 0 - 65535.
>> +
>> +          4 -  7:  dirty_bitmap_directory_size
>> +                   Size of the Dirty Bitmap Directory in bytes. Valid 
>> values:
>> +                   0 - 67108864 (= 1024 * nb_dirty_bitmaps).
> 
> Is it always going to be 1024 * nb_dirty_bitmaps? If so, why do we need
> a redundant field?  If not, then this wording needs help; from the rest
> of this text, it looks like you want "at most 1024 * nb_dirty_bitmaps".
>  Also, while Dirty Bitmap Directory entries are variable length (and
> thus a variable maximum), they do have a minimum size (so the minimum
> value for dirty_bitmap_directory_size must be larger than 0 unless
> nb_dirty_bitmaps is 0, in which case why would we have this header
> extension)
> 

Agree.

>> +
>> +          8 - 15:  dirty_bitmap_directory_offset
>> +                   Offset into the image file at which the Dirty Bitmap
>> +                   Directory starts. Must be aligned to a cluster boundary.
>> +
>> +
>>  == Host cluster management ==
>>  
>>  qcow2 manages the allocation of host clusters by maintaining a reference 
>> count
>> @@ -360,3 +385,103 @@ Snapshot table entry:
>>  
>>          variable:   Padding to round up the snapshot table entry size to the
>>                      next multiple of 8.
>> +
>> +
>> +== Dirty bitmaps ==
>> +
>> +The feature supports storing dirty bitmaps in a qcow2 image.
>> +
>> +=== Cluster mapping ===
>> +
>> +Dirty bitmaps are stored using a ONE-level structure for the mapping of
>> +bitmaps to host clusters. It is called Dirty Bitmap Table.
> 
> s/ONE/one/ (I didn't see the reason for the emphasis)
> 

Emphasis is likely because that's not how the cluster allocation
mechanism works in qcow2 otherwise. We're essentially storing data
straight into what would otherwise be the L1 table.

It's worth clarifying, in my opinion.

>> +
>> +The Dirty Bitmap Table has a variable size (stored in the Dirty Bitmap
> 
> s/The/Each/
> 

Yes -- and adding to this, I might request that the Dirty Bitmap
Directory is documented before the Dirty Bitmap "Table". That way the
spec reads very naturally from header, to directory, to data.

>> +Directory Entry) and may use multiple clusters, however it must be 
>> contiguous
>> +in the image file.
>> +
>> +Given an offset (in bytes) into the bitmap, the offset into the image file 
>> can
>> +be obtained as follows:
>> +
>> +    byte_offset =
>> +        dirty_bitmap_table[offset / cluster_size] + (offset % cluster_size)
>> +
>> +Taking into accout the granularity of the bitmap, an offset in bits into the
> 
> s/accout/account/
> 
>> +image file can be obtained like this:
>> +
>> +    bit_offset =
>> +        byte_offset(bit_nr / granularity / 8) * 8 + (bit_nr / granularity) 
>> % 8
>> +
>> +Here bit_nr is a number of "virtual" bit of the bitmap, which is covered by
>> +"physical" bit with number (bit_nr / granularity).
> 
> I got a bit lost on this sentence. Maybe an example would help?  Is the
> idea that every image has a certain number of clusters, one "virtual"
> bit per cluster, and then the bitmap compresses multiple clusters into
> one "physical" bit according to a compression ratio determined by the
> bitmap granularity?  That is, if I have an image with 64k clusters but
> 128k bitmap granularity, then each physical bit of the bitmap covers 2
> clusters as being dirty?
> 

Yes -- implementation-wise, we create a bitmap with "n" number of
"virtual bits" with a mapping of one "virtual bit" per "sector." (the
sector we use here is BDRV_SECTOR_BITS = 9, 512 bytes.)

If a granularity of, say, g=1 is chosen, (which would have been a user
specified granularity of 512 bytes) then the number of "virtual bits" is
synonymous with the number of stored/physical/actual bits. Otherwise,
Each bit represents a dirty flag for 2^g *sectors*.

However, in the qcow2 world, this is useless, because qcow2 cares about
clusters -- so knowing about a single dirty sector is not necessarily
useful to us at this level. We want to know, generally, about dirty
*clusters*.

The addition of new dirty bitmaps via QMP tries to match the current
cluster size (which itself is a 64K default) and failing that, will use
64K (which we hope is a sane guess as a default.)

So in the case of a cluster size of 64K and the granularity expertly
chosen to be 7, each physical bit represents 2^7 = 128 "virtual" bits
(which are secretly sectors). The nice property here is that one
physical bit winds up mapping pretty directly to one cluster.

Clear as mud?
:)

>> +
>> +Dirty Bitmap Table entry:
>> +
>> +    Bit  0 -  8:    Reserved
> 
> s/Reserved/Reserved, must be 0/
> 
>> +
>> +         9 - 55:    Bits 9-55 of host cluster offset. Must be aligned to a
>> +                    cluster boundary. If the offset is 0, the cluster is
>> +                    unallocated, and should be read as all zeros.
>> +
>> +        56 - 63:    Reserved
> 
> and again (specifying the user must write 0 for now leaves the door open
> for extension)
> 
>> +
>> +=== Dirty Bitmap Directory ===
>> +
>> +Each dirty bitmap, saved in the image is described in the Dirty Bitmap
> 
> s/bitmap,/bitmap/
> s/in the Dirty/in a Dirty/
> 
>> +Directory entry. Dirty Bitmap Directory is a contiguous area in the image 
>> file,
>> +whose starting offset and length are given by the header extension fields
>> +dirty_bitmap_directory_offset and dirty_bitmap_directory_size. The entries 
>> of
>> +the bitmap directory have variable length, depending on the length of the
>> +bitmap name.
>> +
>> +Dirty Bitmap Directory Entry:
>> +
>> +    Byte 0 -  7:    dirty_bitmap_table_offset
>> +                    Offset into the image file at which the Dirty Bitmap 
>> Table
>> +                    for the bitmap starts. Must be aligned to a cluster
>> +                    boundary.
>> +
>> +         8 - 15:    nb_virtual_bits
>> +                    Number of "virtual" bits in the bitmap. Number of
>> +                    "physical" bits would be:
>> +                    (nb_virtual_bits + granularity - 1) / granularity
>> +
>> +        16 - 19:    dirty_bitmap_table_size
>> +                    Number of entries in the Dirty Bitmap Table of the 
>> bitmap.
>> +                    Valid values: 0 - 0x8000000.
>> +                    Also, (dirty_bitmap_table_size * cluster_size) should 
>> not
>> +                    be greater than 0x20000000 (512 MB)
>> +
>> +        20 - 23:    granularity_bits
>> +                    Granularity bits. Valid values are: 0 - 63.
>> +
>> +                    Granularity is calculated as
>> +                        granularity = 1 << granularity_bits
> 
> 63 seems like a rather high limit.  Even 32 (1 bit covering 4 billion
> clusters) is huge.
> 

Sectors, but yes. g=32 would mean 1 bit covers 2TiB with
BDRV_SECTOR_BITS := 9.

bdrv_create_dirty_bitmap allows granularity to be specified in bytes as
a uint32_t. `g` or granularity /bits/ is ultimately calculated as:
g = log_2(sectors_per_bit = (granularity_bytes / (sector_size = 512)))

Since bytes is capped to 2^32-1 and we ensure that it is a value of 2,
the largest we can specify is 2^31, which would produce for us a `g` of 22.

>> +
>> +                    Granularity of the bitmap is how many "virtual" bits
>> +                    accounts for one "physical" bit.
>> +
>> +        24 - 27:    flags
>> +                    Bit
>> +                      0: in_use
>> +                         The bitmap is in use and may be inconsistent.
>> +
>> +                      1: self
>> +                         The bitmap is a dirty bitmap for the containing 
>> image.
>> +
>> +                      2: auto
>> +                         The bitmap should be autoloaded as block dirty 
>> bitmap.
>> +                         Only available if bit 1 (self) is set.
>> +
>> +                      3: read_only
>> +                         The bitmap should not be rewritten.
>> +
>> +                    Bits 4 - 31 are reserved.
>> +
>> +        28 - 29:    name_size
>> +                    Size of the bitmap name. Valid values: 0 - 1023.
> 
> This limit is inconsistent with the limit above that a directory entry
> will be at most 1024 bytes; since you have already burned 30 bytes on
> essential information.  Is 0 allowed (a bitmap with "" as its name), or
> must the name_size be at least 1?  Should you document constraints that
> each bitmap name should be unique within the file (that is, no two
> bitmaps in the directory have the same name)?
> 

To documenting the namespace: yes please!

To zero-length bitmap names: I am not sure what use we have for them --
If more than one empty-name bitmap is stored, they will become ambiguous
and difficult to retrieve meaningfully.

I am of the opinion that each bitmap should have some unique identifier
attached to it. To this end, allowing an empty string for *one* bitmap
is not a problem, but may confuse future developers into thinking it's
okay to have "un-named" bitmaps. It isn't.

I'm for prohibiting the empty bitmap name.

>> +
>> +        variable:   The name of the bitmap (not null terminated).
>> +
>> +        variable:   Padding to round up the Dirty Bitmap Directory Entry 
>> size to
>> +                    the next multiple of 8.
>>
> 

I will revisit this patch after reviewing the rest of the series for
further comments, if any.

Thanks, and sorry it took me so long to get to reviewing this again.

--js



reply via email to

[Prev in Thread] Current Thread [Next in Thread]