Re: [Qemu-devel] [PATCH v3] doc: Add NBD_CMD_BLOCK_STATUS extension


From: Alex Bligh
Subject: Re: [Qemu-devel] [PATCH v3] doc: Add NBD_CMD_BLOCK_STATUS extension
Date: Fri, 2 Dec 2016 18:45:38 +0000

John,

>> +Some storage formats and operations over such formats express a
>> +concept of data dirtiness. Whether the operation is block device
>> +mirroring, incremental block device backup or any other operation with
>> +a concept of data dirtiness, they all share a need to provide a list
>> +of ranges that this particular operation treats as dirty.
>> 
>> How can data be 'dirty' if it is static and unchangeable? (I thought)
>> 
> 
> In a simple case, live IO goes to e.g. hda.qcow2. These writes come from
> the VM and cause the bitmap that QEMU manages to become dirty.
> 
> We intend to expose the ability to fleece dirty blocks via NBD. What
> happens in this scenario would be that a snapshot of the data at the
> time of the request is exported over NBD in a read-only manner.
> 
> In this way, the drive itself is R/W, but the "view" of it from NBD is
> RO. While a hypothetical backup client is busy copying data out of this
> temporary view, new writes are coming in to the drive, but are not being
> exposed through the NBD export.
> 
> (This goes into QEMU-specifics, but those new writes are dirtying a
> version of the bitmap not intended to be exposed via the NBD channel.
> NBD gets effectively a snapshot of both the bitmap AND the data.)

Thanks. That makes sense - or enough sense for me to carry on commenting!

>> I now think what you are talking about is backing up a *snapshot* of a
>> disk that's running, where the disk itself was not connected using NBD?
>> I.e. it's not being 'made dirty' by NBD_CMD_WRITE etc. Rather 'dirtiness'
>> is effectively an opaque state represented in a bitmap, which is binary
>> metadata at some particular level of granularity. It might as well be
>> 'happiness' or 'is coloured blue'. The NBD server would (normally) have
>> no way of manipulating this bitmap.
>> 
>> In previous comments, I said 'how come we can set the dirty bit through
>> writes but can't clear it?'. This (my statement) is now I think wrong,
>> as NBD_CMD_WRITE etc. is not defined to set the dirty bit. The
>> state of the bitmap comes from whatever sets the bitmap which is outside
>> the scope of this protocol to transmit it.
>> 
> 
> You know, this is a fair point. We have not (to my knowledge) yet
> carefully considered the exact bitmap management scenario when NBD is
> involved in retrieving dirty blocks.
> 
> Humor me for a moment while I talk about a (completely hypothetical, not
> yet fully discussed) workflow for how I envision this feature.
> 
> (1) User sets up a drive in QEMU, a bitmap is initialized, an initial
> backup is made, etc.
> 
> (2) As writes come in, QEMU's bitmap is dirtied.
> 
> (3) The user decides they want to root around to see what data has
> changed and would like to use NBD to do so, in contrast to QEMU's own
> facilities for dumping dirty blocks.
> 
> (4) A command is issued that creates a temporary, lightweight snapshot
> ('fleecing') and exports this snapshot over NBD. The bitmap is
> associated with the NBD export at this point at NBD server startup. (For
> the sake of QEMU discussion, maybe this command is "blockdev-fleece")
> 
> (5) At this moment, the snapshot is static and represents the data at
> the time the NBD server was started. The bitmap is also forked and
> represents only this snapshot. The live data and bitmap continue to change.
> 
> (6) Dirty blocks are queried and copied out via NBD.
> 
> (7) The user closes the NBD instance upon completion of their task,
> whatever it was. (Making a new incremental backup? Just taking a peek at
> some changed data? who knows.)
> 
> The point that's interesting here is what do we do with the two bitmaps
> at this point? The data delta can be discarded (this was after all just
> a lightweight read-only point-in-time snapshot) but the bitmap data
> needs to be dealt with.
> 
> (A) In the case of "User made a new incremental backup," the bitmap that
> got forked off to serve the NBD read should be discarded.
> 
> (B) In the case of "User just wanted to look around," the bitmap should
> be merged back into the bitmap it was forked from.
> 
> I don't advise a hybrid where "User copied some data, but not all" where
> we need to partially clear *and* merge, but conceivably this could
> happen, because the things we don't want to happen always will.
> 
> At this point maybe it's becoming obvious that actually it would be very
> prudent to allow the NBD client itself to inform QEMU via the NBD
> protocol which extents/blocks/(etc) that it is "done" with.
> 
> Maybe it *would* actually be useful if, in NBD allowing us to add a
> "dirty" bit to the specification, we allow users to clear those bits.
> 
> Then, whether the user was trying to do (A) or (B) or the unspeakable
> amalgamation of both things, it's up to the user to clear the bits
> desired and QEMU can do the simple task of simply always merging the
> bitmap fork upon the conclusion of the NBD fleecing exercise.
> 
> Maybe this would allow the dirty bit to have a bit more concrete meaning
> for the NBD spec: "The bit stays dirty until the user clears it, and is
> set when the matching block/extent/etc is written to."
> 
> With an exception that external management may cause the bits to clear.
> (I.e., someone fiddles with the backing store in a way opaque to NBD,
> e.g. someone clears the bitmap directly through QEMU instead of via NBD.)
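
For concreteness, the fork / clear / always-merge model described above
could be sketched roughly as follows. This is only an illustration under
assumptions: the DirtyBitmap class and its method names are hypothetical,
a bitmap is modelled as a plain set of block numbers, and none of this
reflects QEMU's actual bitmap code or anything in the NBD spec.

```python
class DirtyBitmap:
    """Toy dirty bitmap: the set of block numbers currently marked dirty."""

    def __init__(self, dirty=()):
        self.dirty = set(dirty)

    def mark(self, block):
        # A write to the live drive dirties the block.
        self.dirty.add(block)

    def clear(self, block):
        # The client reports that it is "done" with this block.
        self.dirty.discard(block)

    def fork(self):
        # Hand the current bits to the snapshot and start the live
        # bitmap afresh, so later writes are recorded separately.
        snapshot = DirtyBitmap(self.dirty)
        self.dirty.clear()
        return snapshot

    def merge(self, other):
        # Fold whatever is still dirty in the fork back into this bitmap.
        self.dirty |= other.dirty


live = DirtyBitmap()
for block in (1, 4, 7):
    live.mark(block)          # step (2): writes dirty the bitmap

fork = live.fork()            # steps (4)/(5): fleecing export starts
live.mark(9)                  # a new write, not visible via the export

fork.clear(1)                 # step (6): client copied blocks 1 and 4
fork.clear(4)                 # and cleared them, but never got to 7

live.merge(fork)              # step (7): always merge on close
print(sorted(live.dirty))     # [7, 9]
```

With this shape, case (A) is the client clearing every bit before the
merge, case (B) is the client clearing none, and the hybrid falls out
without any special handling.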

There is currently one possible "I'm done with the entire bitmap"
signal, which is closing the connection. This has two obvious
problems. Firstly, if used, it discards the entire bitmap (not
individual bits).
Secondly, it makes recovery from a broken TCP session difficult
(as either you treat a dirty close as meaning the bitmap needs
to hang around, in which case you have a garbage collection issue,
or you treat it as needing to drop the bitmap, in which case you
can't recover).

I think in your plan the block status doesn't change once the bitmap
is forked. In that case, adding some command (optional) to change
the status of the bitmap (or simply to set a given extent to status X)
would be reasonable. Of course whether it's supported could be dependent
on the bitmap.
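
To make that a bit more concrete, here is a rough sketch of a per-extent
status store with an optional "set status" operation. Everything in it is
an assumption for illustration: the StatusMap name, the integer status
values, the fixed granularity and the writable flag are all hypothetical,
and no such command exists in the protocol as discussed here.

```python
class StatusMap:
    """Toy block-status store: one integer status per fixed-size granule."""

    def __init__(self, size, granularity, writable=False):
        self.granularity = granularity
        # Round up so a partial trailing granule still gets a slot.
        self.status = [0] * (-(-size // granularity))
        # Whether changing status is supported depends on the bitmap.
        self.writable = writable

    def _granules(self, offset, length):
        if length <= 0:
            raise ValueError("length must be positive")
        first = offset // self.granularity
        last = (offset + length - 1) // self.granularity
        return range(first, last + 1)

    def extents(self, offset, length):
        # Report (length, status) pairs, coalescing equal neighbours.
        # Lengths are rounded out to whole granules.
        out = []
        for g in self._granules(offset, length):
            s = self.status[g]
            if out and out[-1][1] == s:
                out[-1] = (out[-1][0] + self.granularity, s)
            else:
                out.append((self.granularity, s))
        return out

    def set_status(self, offset, length, status):
        # The optional operation: set a given extent to status X.
        if not self.writable:
            raise PermissionError("this status map is read-only")
        for g in self._granules(offset, length):
            self.status[g] = status


G = 65536
dirty = StatusMap(size=4 * G, granularity=G, writable=True)
dirty.status = [1, 1, 0, 1]           # pretend state at fork time
dirty.set_status(G, G, 0)             # client is done with granule 1
print(dirty.extents(0, 4 * G))        # [(65536, 1), (131072, 0), (65536, 1)]
```

A server exposing a bitmap it cannot (or will not) modify would simply
leave writable off and fail the request.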

> Having missed most of the discussion on v1/v2, is it a given that we
> want in-band identification of bitmaps?
> 
> I guess this might depend very heavily on the nature of the definition
> of the "dirty bit" in the NBD spec.

I don't think it's a given. I think Wouter & I came up with it at
the same time as a way to abstract the bitmap/extent concept and
remove the need to specify a dirty bit at all (well, that's my excuse
anyway).

> Anyway, I hope I am being useful and just not more confounding. It seems
> to me that we're having difficulty conveying precisely what it is we're
> trying to accomplish, so I hope that I am making a good effort in
> elaborating on our goals/requirements.

Yes absolutely. I think part of the challenge is that you are quite
reasonably coming at it from the point of view of qemu's particular
need, and I'm coming at it from 'what should the nbd protocol look
like in general' position, having done lots of work on the protocol
docs (though I'm an occasional qemu contributor). So there's necessarily
a gap of approach to be bridged.

I'm overdue on a review of Wouter's latest patch (partly because I need
to re-diff it against the version with no NBD_CMD_BLOCK_STATUS in),
but I think it's a bridge worth building.

-- 
Alex Bligh






