
Re: [Qemu-devel] [RFC] qemu snapshot enhancement


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [RFC] qemu snapshot enhancement
Date: Tue, 29 Jan 2013 14:27:10 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Jan 29, 2013 at 10:58:56AM +0800, Wenchao Xia wrote:
> >On 2013-1-28 21:00, Stefan Hajnoczi wrote:
> >On Fri, Jan 25, 2013 at 05:16:46PM +0800, Wenchao Xia wrote:
> >>On 2013-1-24 17:47, Stefan Hajnoczi wrote:
> >>>
> >>>>>Case 3:
> >>>>>
> >>>>>  * What does "blank data" mean?  Besides that the use case
> >>>>>    makes sense.
> >>>>>
> >>>>   Will remove the words.
> >>>>
> >>>>>  * When discussing this use case in the past it was suggested that the
> >>>>>    guest doesn't need to be paused during the LVM snapshot.  Instead the
> >>>>>    QEMU block layer might be able to queue I/O requests, allowing the
> >>>>>    guest to run.
> >>>>>
> >>>>   That is a good idea, but it seems to need more work (events, block
> >>>>layer changes...). I hope it can be added as an enhancement of this
> >>>>case. For now, let the dedicated storage software/hardware take the job
> >>>>by pausing for a while (<200ms?).
> >>>
> >>>Yes, allowing the guest to continue but queuing I/O will require extra
> >>>block layer work and maybe a QMP command.  There is also a risk: if
> >>>the snapshot takes too long to complete, the guest may notice that its
> >>>I/O requests are taking a long time.  It may decide that they have timed
> >>>out and report an error to the application or in the message logs.
> >>>
> >>>In the beginning it's easier to pause the VM but let's keep queuing I/O
> >>>in mind so it can be added later, if necessary.
> >>>
> >>   Yep, the code should leave room for queuing.
> >>
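The queue-then-drain idea discussed above can be sketched in Python. This is a toy model, not actual QEMU block-layer code; `BlockQueue` and its method names are made up for illustration:

```python
from collections import deque

class BlockQueue:
    """Sketch of the queue-I/O-during-snapshot idea: instead of pausing
    the VM, hold guest requests while the snapshot runs, then drain."""
    def __init__(self):
        self.snapshot_in_progress = False
        self.pending = deque()   # requests held during the snapshot
        self.completed = []      # requests that have been processed

    def submit(self, req):
        # While a snapshot is in progress, queue the request instead of
        # processing it; otherwise process it immediately.
        if self.snapshot_in_progress:
            self.pending.append(req)
        else:
            self.completed.append(req)

    def begin_snapshot(self):
        self.snapshot_in_progress = True

    def end_snapshot(self):
        # Snapshot done: drain everything queued in the meantime, in order.
        self.snapshot_in_progress = False
        while self.pending:
            self.completed.append(self.pending.popleft())
```

The timeout risk Stefan mentions shows up here as the time a request can sit in `pending` before `end_snapshot()` drains it.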
> >>   I have updated the wiki, adding step-by-step details for the cases.
> >>Case 3 is fixed; it has the best performance among the qemu-managed
> >>types. But I am not sure whether it is workable in theory to export the
> >>base data of a qcow2 image that has an internal snapshot. What do you
> >>think?
> >
> >Yes, it is theoretically possible to access snapshot data while the
> >guest is running.
> >
> >Open the qcow2 read-only and use bdrv_snapshot_load_tmp() to activate
> >the snapshot without modifying the qcow2 file on disk.  This is pretty
> >easy to implement today by adding options to qemu-nbd or the in-process
> >NBD server QMP commands.
> >
> >Stefan
> >
>   Hi Stefan, thank you for the information. I'd still like to spell out
> the steps needed for incremental backup:
> 1 The VM has disk.qcow2.
> 2 Take internal snapshot snap1.
> 3 Export snap1 as the base to the incremental backup server.
> 4 Take internal snapshot snap2.
> 5 Export snap2 as a delta to the server; note snap2 must be delta data
> relative to snap1, not full data.
> 6 Delete snap1.
> Repeat steps 4-6...
> 
> The VM must keep running throughout.
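The rotation in steps 4-6 above can be sketched as follows; `ops` stands in for whatever object issues the real QMP/NBD operations, and all of its method names are hypothetical:

```python
def incremental_backup(cycles, ops):
    """Sketch of the snapshot-rotation loop from the steps above.
    ops.take_snapshot/export_base/export_delta/delete_snapshot are
    hypothetical stand-ins for real QMP/NBD operations."""
    ops.take_snapshot("snap1")        # step 2
    ops.export_base("snap1")          # step 3: full data as the base
    prev = "snap1"
    for i in range(cycles):
        cur = "snap%d" % (i + 2)
        ops.take_snapshot(cur)        # step 4
        ops.export_delta(prev, cur)   # step 5: only blocks changed since prev
        ops.delete_snapshot(prev)     # step 6: drop the now-redundant snapshot
        prev = cur
    return prev                       # the snapshot still held on disk
```

At any point only one old snapshot (`prev`) and the running disk state exist, which is what keeps the on-disk overhead bounded while the VM keeps running.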
> 
>   It seems that in step 3 the base can be loaded with
> bdrv_snapshot_load_tmp(), but there is a problem: the disk's top-level
> content can't be used by the VM at that time.

The VM can access the disk if you open a *new* read-only
BlockDriverState.  Don't touch the existing BlockDriverState that is in
use by the guest.

Doing this is slightly risky - for example, what happens if the internal
snapshot is deleted while the read-only BDS is still in use?  But it's
not fundamentally unreliable; we just need a policy in QEMU or the QMP
client (libvirt) to behave reasonably.
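One such policy could be a simple reference count on each exported snapshot, refusing deletion while a read-only export still uses it. A sketch only; `SnapshotRefs` is not a QEMU API:

```python
class SnapshotRefs:
    """Sketch of a deletion policy: track how many read-only exports
    reference each internal snapshot, and refuse to delete a snapshot
    that is still in use (hypothetical, for illustration)."""
    def __init__(self):
        self.refs = {}  # snapshot name -> number of read-only users

    def open_readonly(self, name):
        # A new read-only BDS/export starts using this snapshot.
        self.refs[name] = self.refs.get(name, 0) + 1

    def close_readonly(self, name):
        self.refs[name] -= 1

    def delete(self, name):
        # Policy: deletion fails while any export still holds the snapshot.
        if self.refs.get(name, 0) > 0:
            raise RuntimeError("snapshot %s is still exported" % name)
        self.refs.pop(name, None)
        return True
```

Whether this check lives in QEMU itself or in the QMP client (libvirt) is exactly the open question above.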

> And in step 5, we may need to export only the delta data, not the whole
> disk data.

NBD doesn't have a way to perform bdrv_is_allocated().  Either we need
to enhance the protocol or we need to add a QMP command to read
the allocation bitmap for an image.  I'm a little hesitant about sending
the bitmap or allocation extent information over QMP (JSON) but it might
be doable.

Then the backup host knows which regions of the snapshot contain new
blocks that need to be backed up.
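For example, the allocation bitmap could be run-length encoded into [offset, length] extents before being sent as JSON, which keeps the QMP payload compact. A minimal sketch (the QMP command itself is only a proposal in this thread, and `bitmap_to_extents` is a name invented here):

```python
import json

def bitmap_to_extents(bitmap, cluster_size):
    """Convert a per-cluster allocation bitmap (list of bools) into
    [offset, length] byte extents - the kind of compact form that
    could be carried over QMP as JSON."""
    extents = []
    start = None
    for i, allocated in enumerate(bitmap):
        if allocated and start is None:
            start = i                      # run of allocated clusters begins
        elif not allocated and start is not None:
            extents.append([start * cluster_size,
                            (i - start) * cluster_size])
            start = None                   # run ends
    if start is not None:                  # close a run reaching the end
        extents.append([start * cluster_size,
                        (len(bitmap) - start) * cluster_size])
    return extents

# Clusters 0-1 and 3 allocated, 64 KiB clusters:
print(json.dumps(bitmap_to_extents([True, True, False, True], 65536)))
# -> [[0, 131072], [196608, 65536]]
```

The backup host would then read only those byte ranges from the NBD export of the snapshot.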

> Step 6 needs to be quick enough for the synchronous API type.

Is qcow2 internal snapshot deletion a very long operation?  There will
be guest downtime, but hopefully nothing noticeable for normal users
(couple hundred milliseconds).

Stefan


