qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specificat


From: John Snow
Subject: Re: [Qemu-devel] [PATCH v4 RFC] spec: add qcow2-dirty-bitmaps specification
Date: Mon, 14 Dec 2015 17:26:24 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0


On 12/14/2015 04:44 PM, Max Reitz wrote:
> On 14.12.2015 21:45, John Snow wrote:
>>
>>
>> On 12/14/2015 03:05 PM, Max Reitz wrote:
>>> On 14.12.2015 18:43, Vladimir Sementsov-Ogievskiy wrote:
>>>> The new feature for qcow2: storing dirty bitmaps.
>>>>
>>>> Only dirty bitmaps relative to this qcow2 image should be stored in it.
>>>>
>>>> Strings started from +# are RFC-strings, not to be commited of course.
>>>>
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
>>>> ---
>>>>
>>>>  docs/specs/qcow2.txt | 151 
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>  1 file changed, 150 insertions(+), 1 deletion(-)
>>>
>>> Overall: Looks better to me. Good enough for me to ACK it, but I still
>>> have some issues with it.
>>>
>>> Let's evaluate the main point of critique I had: I really want this not
>>> to be qemu-specific but potentially useful to all programs.
>>>
>>> Pretty good: You do implicitly describe what a (dirty) bitmap looks like
>>> by describing how to obtain the bit offset of a certain byte guest
>>> offset. So it's not an opaque binary data dump anymore.
>>>
>>> (Why only "pretty good"? I find the description to be a bit too
>>> "implicit", I think a separate section describing the bitmap structure
>>> would be better.)
>>>
>>> Good: The bitmap actually describes the qcow2 file.
>>>
>>> Not so good: While now any program knows how to read the bitmap and that
>>> it does refer to this qcow2 file, it's interpretation is not so easy
>>> still. Generally, a dirty bitmap has some reference point, that is the
>>> state of the disk when the bitmap was cleared or created. For instance,
>>> for incremental backups, whenever you create a backup based on a dirty
>>> bitmap, the dirty bitmap is cleared and the backup target is then said
>>> reference point.
>>> I think it would be nice to put that reference point (i.e. the name of
>>> an image file that contains the clean image) into the dirty bitmap
>>> header, if possible.
>>>
>>
>> This starts to get a little spooky, because QEMU doesn't necessarily
>> know where (or what) the reference is. QEMU doesn't even know where the
>> last incremental is.
>>
>> It might be hard to store something meaningful here.
> 
> Yes, I thought as much, that's where the "if possible" comes in.
> 
> I don't think it would hurt to include the field even if we're unsure
> how much use we can make of it. If we can store something nice there,
> great! If we can't, too bad.
> 
>> I suppose the management application could do it itself if it wants to.
> 
> Considering the qcow2 image is closed by qemu anyway just after the
> bitmaps are written into it, this might indeed be possible. But I
> wouldn't count on it.
> 
>>>
>>> (Note: I won't comment on orthography, because I feel like that is
>>> something a native speaker should do. O:-))
>>>
>>>> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
>>>> index 121dfc8..3c89580 100644
>>>> --- a/docs/specs/qcow2.txt
>>>> +++ b/docs/specs/qcow2.txt
> 
> [...]
> 
>>>> +# Let's be strict, the feature should be deleted with deleting last 
>>>> bitmap.
>>>> +
>>>> +          4 -  7:  dirty_bitmap_directory_size
>>>> +                   Size of the Dirty Bitmap Directory in bytes. It should 
>>>> be
>>>> +                   equal to sum of sizes of all (nb_dirty_bitmaps) dirty 
>>>> bitmap
>>>> +                   headers.
>>>
>>> No, it "should" not be equal, it *must* be equal. But I think you can
>>> just omit that last sentence, that would be just as fine.
>>>
>>
>> It's informative, though. Just another clarifying detail that the bitmap
>> directory is the collection of all the dirty bitmap headers.
>>
>> Replacing "should" with "must" is sufficient.
> 
> Of course.
> 
> [...]
> 
>>>> +=== Dirty Bitmap Directory ===
> 
> [...]
> 
>>>> +        12 - 15:    flags
>>>> +                    Bit
>>>> +                      0: in_use
>>>> +                         The bitmap was not saved correctly and may be
>>>> +                         inconsistent.
>>>> +
>>>> +                      1: auto
>>>> +                         The bitmap should be autoloaded as block dirty 
>>>> bitmap
>>>> +                         and tracking should be started. Type of the 
>>>> bitmap
>>>> +                         should be 'Dirty Tracking Bitmap'.
>>>
>>> I find the wording a bit too qemu-specific. How about this:
>>>
>>> This bitmap is the default dirty bitmap for the virtual disk represented
>>> by this qcow2 image. It should track all write accesses immediately
>>> after the image has been opened.
>>>
>>
>> Closer; though we can certainly have more than one, so "a" may not be
>> appropriate.
>>
>> "This bitmap is a dirty tracking bitmap for the virtual disk represented
>> by this qcow2 image."
> 
> Hm, aren't all dirty tracking bitmaps of this image... Well, dirty
> tracking bitmaps of this image?
> 

Tch, sorry. The hyper-generalization is setting in again. Yes, of
course. I was trying to just remove "the" in favor of "a/an."

> We currently don't have anything but dirty tracking bitmaps, and the
> flag actually is explicitly valid only for that kind of bitmaps (for
> now); also, in this revision, all dirty bitmaps do refer to this very
> qcow2 file anyway.
> 

You're right, though using the explicit language "This bitmap [which is
/definitely/ a dirty tracking type...]" is just to cement the idea.

> Maybe I just interpreted the sentence wrong and put too much weight into
> the first part ("should be autoloaded as block dirty bitmap") so I felt
> the need to translate it to an explicit "is the default".
> 
> So I guess this bit simply means that the bitmap should be active in
> that it tracks new writes?
> 

Yes: the idea is of course that QEMU will see this and know that it
needs to load this bitmap as active. It's useful for the "default"
bitmap, but there's no strict reason why there can't be multiple.

(Why is that useful? In my thoughts, it's for different backup paradigms
that backup to different places on different schedules. Weekly vs.
daily, for instance.)

>> And to avoid "should" again:
>>
>> "All writes to this file must also be represented in this bitmap."
> 
> :-)
> 
> I wasn't sure here. I felt "should" appropriate because maybe you do
> have a very compelling reason not to do so and to let a certain write
> access slip by on purpose.
> 

For the dirty type, I don't think so -- unless you have since decided to
disable the bitmap and discontinue its use (but keep the data, for
whatever reason that may be), but at that point you should have also
toggled off the 'auto' bit, so I don't think so.

> So this is up to you, if it's "must", then it's "must". If it's
> "should", then it's "should".
> 

If the auto bit is on, it's a must.

Let's try again:

"1: auto
    This bitmap is of the Dirty Tracking Type and must accurately
    reflect all writes to the virtual disk image by any application
    that would write to it. This bitmap should not lose the auto bit
    except by user intervention."

Or something like that? The idea is that it will indeed be "autoloaded"
and made active, but some bitmaps might be stored in various
frozen/unactive states where this is not desirable.

>>> And I find the "should" in "Type of the bitmap should be..." a bit too
>>> weak. Maybe "This flag is valid only for Dirty Tracking Bitmaps" is better.
>>>
>>
>> Sure.
>>
> 
> [...]
> 
>>>> +# This is mostly for error checking and information in qemu-img info 
>>>> output.
>>>> +# The other types may be, for example, "Backup Bitmap" - to make it 
>>>> possible
>>>> +# stop backup job on vm stop and resume it later. The another one is 
>>>> "Sector
>>>> +# Alloction Bitmap" (Fam, John, please comment).
>>>
>>> I'm waiting for their comments because that sounds like "refcount table
>>> with refcount_bits=1" to me. :-)
>>>
>>
>> The idea is that we may allow for bitmaps to store other kinds of
>> information, like "allocated in this layer." That information is not
>> necessarily useful to qcow2, but it might be for other image formats. If
>> we ever do add such subtypes, we can always add a new reserved entry:
>>
>> # 1: Reserved - Invalid for qcow2
>> # 2: Backup Progress Bitmap ...
>> # 3-255: Reserved
>>
>> Backup progress bitmaps may indeed be useful and sane information to
>> store in a qcow2, though.
> 
> OK. I was wondering because I want to (again) make sure that these other
> bitmaps are not plain dumps of some obscure bitmap qemu has, but that we
> can imagine bitmaps that are generally useful.
> 
> First: OK, it doesn't really matter. Having this field definitely is
> good, even if we wouldn't actually use it.
> 
> <offtopic reason="This series is about dirty bitmaps and not about other
> types of bitmaps">
> Now, my thoughts about backup progress bitmaps are the following:
> 
> If we put the backup progress into the source file, that's a bit
> strange. You aren't modifying this file, so why would you put it here?
> Also, no program but the one doing the backup can do anything with that
> bitmap. Others will just say "Nice to know how far you're in your
> backup, but, well, am I supposed do something about it?"
> 

You're not wrong.

> Instead, we could put it into the target. That would be nicer, because
> then you open the target and you can see "Oh, there's a backup progress
> bitmap! I guess I'll need to copy these clusters from the backup source
> [given in the backup bitmap header]."
> 

Sure! I had not thought through the feature in any great detail -- we're
only generally aware that we may want to store more than one kind of
bitmap in the future.

There's even an RFE for this exact type of bitmap, complete with a
hilariously wrong estimate for when I'd have this done by:

https://bugzilla.redhat.com/show_bug.cgi?id=905123

> Strictly speaking, the same can be achived by combining the dirty bitmap
> used for the backup (in case of an incremental backup) and the refcount
> information for the backup target. However, finding the right dirty
> bitmap may be a bit cumbersome, so I do think having such a backup
> progress bitmap (in the target image) is justified.
> </offtopic>
> 
> (If you've already decided to put the backup progress into the target
> image, you may find that offtopic block not very helpful; however, it
> may help you get an idea of when I consider some information to be
> beneficial to a qcow2 file.)
> 

<Offtopic>
There's no code written specifically for this yet, so we're still OK. It
could reasonably go into either the source or destination:

Destination: "Here are the sectors I still need" is reasonable, as then
you can easily go and fetch the missing bits you need.

However, if the source was modified after the bitmap was written to the
destination, we have no way of knowing if anything has been updated
since we started the backup.

If we store it in the source: "Here's what's changed [and what we never
copied] since we started that backup," we can even make offline writes
to the source after we stop the backup, then resume it later and still
have it work out.

In the destination model, we point to the source for where to get the
rest of the data. In the source model, we point to the destination of
our unfinished backup.

One definitely feels more independently useful: "Here's how to get the
rest of my data" vs "I was being used for an operation you may or may
not care about.", though the less meaningful one might actually be
easier and more flexible to maintain from QEMU's standpoint.

Offtopic indeed.

</Offtopic>

The only real point is that there *might* be different semantics in the
future, but we may also be able to get away with overloading the
existing type entirely.

The reserved type enum will work out OK for this.

>> Fam may have more opinions as he's been working on this area of thought
>> recently.
>>
>>>> +             19:    granularity_bits
>>>> +                    Granularity bits. Valid values are: 0 - 31.
>>>> +# Now, qemu allows creating bitmaps with granularity as a 32bit value. And
>>>> +# there are no reasons of increasing it.
>>>
>>> Good (implicit) question. I can't imagine any reason for wanting to have
>>> a coarser granularity than 2 GB, but I do think there may be a need in
>>> the future for some people.
>>>
>>> Once again, I think we should discriminate between what is generally a
>>> useful limitation and what is simply due to qemu not supporting anything
>>> else right now.
>>>
>>> Thus, I think it would be better to increase the range to 0 - 63 and
>>> make a note that qemu only supports values up to 31 right now.
>>>
>>
>> I suppose that won't hurt anything.
>>
>> (I look forward to the future where the hard drives are so big and the
>> network bandwidth so bountiful that 2GB granularity is seen as "too
>> fine-grained!")
> 
> 640k... ;-)
> 
> [...]
> 

"I look forward to the future" wasn't sarcasm as much as it was an
honest statement. :)

>>>> +
>>>> +        variable:   The name of the bitmap (not null terminated). Should 
>>>> be
>>>> +                    unique among all dirty bitmap names within the Dirty
>>>> +                    bitmaps extension.
>>>> +
>>>> +        variable:   Padding to round up the Dirty Bitmap Directory Entry 
>>>> size
>>>> +                    to the next multiple of 8.
>>>
>>> What I'd like here is variable additional information based on the
>>> bitmap type. For some types, this may be absolutely necessary; for dirty
>>> tracking bitmaps it depends on what we do about the reference point thing.
>>>
>>> The reference point thing is the following: As mentioned at the top, I'd
>>> like there to be some kind of description of what the clean state was.
>>> As far as I know, this is generally a backup in the form of a file. In
>>> that case, we could put that filename here.
>>>
>>
>> We may also have exported that backup to an NBD server and we're not
>> sure (on the local end) where that data is anymore, though.
>>
>> For local utility usage, when a reference is possible, we might be able
>> to list it as an optional nice thing, but I think requiring it might be
>> difficult.
> 
> You're right, we don't need to require it.
> 
>>> I don't think not having a reference point description is a serious show
>>> stopper. qemu itself does rely on the management layer to know which
>>> bitmap to use when. But I think it would be pretty nice to have it here.
>>>
>>
>> I'm not opposed to listing some "nice" information when available.
>>
>> last_backup /path/to/incremental.5.qcow2
>> base_backup /path/to/reference.qcow2
>>
> 
> I don't think we need the base_backup since you can get that by walking
> through the backing chain of the reference point (the backup target),
> but it probably won't hurt if you can make a good general semantic
> connection to the dirty bitmap.
> 
> Max
> 

Yes, probably not. How would you suggest this be implemented, also? Does
the bitmap object need to begin tracking a last-known backup
filename/identifier? I remembered you didn't seem too happy about
keeping filenames in memory for QEMU, but we currently don't track this
information at all. Answer has so far been "It's up to management!"

--js



reply via email to

[Prev in Thread] Current Thread [Next in Thread]