qemu-devel
Re: [Qemu-devel] live block copy/stream/snapshot discussion


From: Kevin Wolf
Subject: Re: [Qemu-devel] live block copy/stream/snapshot discussion
Date: Tue, 12 Jul 2011 10:06:52 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110428 Fedora/3.1.10-1.fc15 Thunderbird/3.1.10

On 11.07.2011 18:32, Marcelo Tosatti wrote:
> On Mon, Jul 11, 2011 at 03:47:15PM +0100, Stefan Hajnoczi wrote:
>> Kevin, Marcelo,
>> I'd like to reach agreement on the QMP/HMP APIs for live block copy
>> and image streaming.  Libvirt has acked the image streaming APIs that
>> Adam proposed and I think they are a good fit for the feature.  I have
>> described that API below for your review (it's exactly what the QED
>> Image Streaming patches provide).
>>
>> Marcelo: Are you happy with this API for live block copy?  Also please
>> take a look at the switch command that I am proposing.
>>
>> Image streaming API
>> ===================
>>
>> For leaf images with copy-on-read semantics, the stream commands allow the
>> user to populate local blocks by manually streaming them from the backing
>> image.
>> Once all blocks have been streamed, the dependency on the original backing
>> image can be removed.  Therefore, stream commands can be used to implement
>> post-copy live block migration and rapid deployment.
>>
>> The block_stream command can be used to stream a single cluster, to
>> start streaming the entire device, and to cancel an active stream.  It
>> is easiest to allow the block_stream command to manage streaming for the
>> entire device but a management tool could use single cluster mode to
>> throttle the I/O rate.

As discussed earlier, having the management tool send a request for every
single cluster doesn't make sense. It would not merely throttle the I/O
rate, it would bring it down to a level that makes streaming unusable. What
you really want is to let the management tool give us a range (offset +
length) that qemu should stream.
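
To illustrate the range idea, here is a minimal sketch of how a management
tool might split a device into a few large ranges instead of issuing one
request per cluster. The 'offset' and 'length' argument names are
assumptions made for this example only; they are not part of the posted
block_stream API, and the helper name is hypothetical.

```python
import json


def stream_range_commands(device, device_len, chunk_len):
    """Yield one range-based block_stream command per chunk of the device.

    'offset' and 'length' are assumed argument names for illustration;
    the posted API does not define them.
    """
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    offset = 0
    while offset < device_len:
        # The last chunk may be shorter than chunk_len.
        length = min(chunk_len, device_len - offset)
        yield {
            "execute": "block_stream",
            "arguments": {"device": device, "offset": offset, "length": length},
        }
        offset += length


if __name__ == "__main__":
    # A 10 GiB device streamed in 4 GiB ranges needs three requests.
    for cmd in stream_range_commands("virtio0", 10737418240, 4 * 1024**3):
        print(json.dumps(cmd))
```

With ranges like these, the management tool can still pace the work by
choosing the chunk size and the delay between requests, without paying a
monitor round trip per cluster.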

>> The command synopses are as follows:
>>
>> block_stream
>> ------------
>>
>> Copy data from a backing file into a block device.
>>
>> If the optional 'all' argument is true, this operation is performed in the
>> background until the entire backing file has been copied.  The status of
>> ongoing block_stream operations can be checked with query-block-stream.

Not sure if it's a good idea to use a bool argument to turn a command
into its opposite. I think having a separate command for stopping would
be cleaner. Something for the QMP folks to decide, though.
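
One way to see the problem with the flag approach: the caller can combine
the flags in ways that contradict each other, so the command has to
validate them. The sketch below assumes that 'all' and 'stop' are meant to
be mutually exclusive (the posted text does not say so explicitly), and
'block_stream_stop' is a hypothetical name for a separate command, not an
existing one.

```python
def validate_block_stream_args(args):
    """Check a flag-based block_stream request; raise ValueError if invalid.

    Assumption: 'all' and 'stop' must not both be true.
    """
    if "device" not in args:
        raise ValueError("'device' is required")
    if args.get("all") and args.get("stop"):
        raise ValueError("'all' and 'stop' cannot both be true")


def make_stop_command(device):
    """Build a request for a separate stop command (hypothetical name)."""
    return {"execute": "block_stream_stop", "arguments": {"device": device}}
```

With a dedicated stop command the contradictory combination cannot be
expressed at all, so that check disappears.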

>> Arguments:
>>
>> - all:    copy entire device (json-bool, optional)
>> - stop:   stop copying to device (json-bool, optional)
>> - device: device name (json-string)
> 
> It must be possible to specify the backing file that will be
> active after streaming finishes (data from that file will not
> be streamed into the active file, of course).

Yes, I think the common base image belongs here.

With all = false, where does the streaming begin? Do you have something
like the "current streaming offset" in the state of each
BlockDriverState? As I said above, I would prefer adding offset and
length to the arguments.

>> Return:
>>
>> - device: device name (json-string)
>> - len:    size of the device, in bytes (json-int)
>> - offset: ending offset of the completed I/O, in bytes (json-int)

So you only get the reply when the request has completed? With the
current monitor, this means that QMP is blocked while we stream, doesn't
it? How are you supposed to send the stop command then?

Two of the three examples below have an empty return value instead, so
they do not comply with this specification.

>> Examples:
>>
>> -> { "execute": "block_stream", "arguments": { "device": "virtio0" } }
>> <- { "return":  { "device": "virtio0", "len": 10737418240, "offset": 512 } }
>>
>> -> { "execute": "block_stream", "arguments": { "all": true, "device": "virtio0" } }
>> <- { "return": {} }
>>
>> -> { "execute": "block_stream", "arguments": { "stop": true, "device": "virtio0" } }
>> <- { "return": {} }
>>
>> query-block-stream
>> ------------------
>>
>> Show progress of ongoing block_stream operations.
>>
>> Return a json-array of all operations.  If no operation is active then an
>> empty array will be returned.  Each operation is a json-object with the
>> following data:
>>
>> - device: device name (json-string)
>> - len:    size of the device, in bytes (json-int)
>> - offset: ending offset of the completed I/O, in bytes (json-int)
>>
>> Example:
>>
>> -> { "execute": "query-block-stream" }
>> <- { "return":[
>>        { "device": "virtio0", "len": 10737418240, "offset": 709632}
>>     ]
>>   }

When block_stream is changed, query-block-stream will need the same changes.

>> Block device switching API
>> ==========================
>>
>> Extend the 'change' command to support changing the image file without
>> media change notification.
>>
>> Perhaps we should take the opportunity to add a "format" argument for
>> image files?
>>
>> change
>> ------
>>
>> Change a removable medium or VNC configuration.
>>
>> Arguments:
>>
>> - "device": device name (json-string)
>> - "target": filename or item (json-string)
>> - "arg": additional argument (json-string, optional)
>> - "notify": whether to notify guest, defaults to true (json-bool, optional)
>>
>> Examples:
>>
>> 1. Change a removable medium
>>
>> -> { "execute": "change",
>>              "arguments": { "device": "ide1-cd0",
>>                             "target": "/srv/images/Fedora-12-x86_64-DVD.iso" } }
>> <- { "return": {} }
>>
>> 2. Change a disk without media change notification
>>
>> -> { "execute": "change",
>>              "arguments": { "device": "virtio-blk0",
>>                             "target": "/srv/images/vm_1.img",
>>                             "notify": false } }
>>
>> 3. Change VNC password
>>
>> -> { "execute": "change",
>>              "arguments": { "device": "vnc", "target": "password",
>>                             "arg": "foobar1" } }
>> <- { "return": {} }

I find it rather disturbing that a command like 'change' has made it
into QMP... Anyway, I don't think this is really what we need.

We have two switches to do. The first one happens before starting the
copy: Creating the copy, with the source as its backing file, and
switching to that. The monitor command to achieve this is snapshot_blkdev.

The second switch is after the copy has completed. At this point you can
remove the source as the backing file and use the common base image
instead. This is a call to bdrv_change_backing_file(), for which a
monitor command doesn't exist yet (and unless we want to overload
'change' even more, it's not the right command to do this).
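
The two switches can be pictured as operations on the backing chain. The
following is a toy model only: the Image class, the function names and the
file names are made up for illustration and do not correspond to qemu
code. It only shows which image is active and what it is backed by at each
step.

```python
class Image:
    """Toy model of an image file with an optional backing image."""

    def __init__(self, name, backing=None):
        self.name = name
        self.backing = backing

    def chain(self):
        """Return the image names from the active image down to the base."""
        names = []
        img = self
        while img is not None:
            names.append(img.name)
            img = img.backing
        return names


def first_switch(source, copy_name):
    """Before copying: create the copy backed by the source and use it."""
    return Image(copy_name, backing=source)


def second_switch(active, base):
    """After copying: drop the source and back the copy by the common base."""
    active.backing = base
    return active


if __name__ == "__main__":
    base = Image("base.img")
    source = Image("source.img", backing=base)
    active = first_switch(source, "copy.img")
    print(active.chain())   # copy.img on top of source.img on top of base.img
    second_switch(active, base)
    print(active.chain())   # source.img is no longer part of the chain
```

In this picture the first switch corresponds to what snapshot_blkdev does,
and the second to the bdrv_change_backing_file() call.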

Kevin


