Re: [Qemu-devel] kvm / virsh snapshot management

From:	Gary Dale
Subject:	Re: [Qemu-devel] kvm / virsh snapshot management
Date:	Mon, 10 Jun 2019 17:27:39 -0400
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.0

On 2019-06-10 11:54 a.m., Gary Dale wrote:

On 2019-06-10 8:19 a.m., Stefan Hajnoczi wrote:
On Sat, Jun 01, 2019 at 08:12:01PM -0400, Gary Dale wrote:
A while back I converted a raw disk image to qcow2 to be able to use
snapshots. However I realize that I may not really understand exactly how snapshots work. In this particular case, I'm only talking about internal snapshots currently as there seems to be some differences of opinion as to whether internal or external are safer/more reliable. I'm also only talking
about shutdown state snapshots, so it should just be the disk that is
snapshotted.
As I understand it, the first snapshot freezes the base image and subsequent changes in the virtual machine's disk are stored elsewhere in the qcow2 file
(remember, only internal snapshots). If I take a second snapshot, that
freezes the first one, and subsequent changes are now in third location. Each new snapshot is incremental to the one that preceded it rather than differential to the base image. Each new snapshot is a child of the previous
one.
Internal snapshots are not incremental or differential at the qcow2
level, they are simply a separate L1/L2 table pointing to data clusters.
In other words, they are an independent set of metadata showing the full
state of the image at the point of the snapshot.  qcow2 does not track
relationships between snapshots and parents/children.
Which sounds to me like they are incremental. Each snapshot starts a new L1/L2 table so that the state of the previous one is preserved.
One explanation I've seen of the process is if I delete a snapshot, the
changes it contains are merged with its immediate child.
Nope.  Deleting a snapshot decrements the reference count on all its
data clusters.  If a data cluster's reference count reaches zero it will
be freed.  That's all, there is no additional data movement or
reorganization aside from this.
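
To make the reference-counting description above concrete, here is a minimal toy model in Python. It is only a sketch under simplifying assumptions: the class name Qcow2Model and its methods are hypothetical, a single flat dictionary stands in for the two-level L1/L2 table, and nothing here reflects the real on-disk layout. It shows that each snapshot holds its own full mapping, and that deleting one only drops references.

```python
from collections import Counter


class Qcow2Model:
    """Toy model (hypothetical): tables map guest cluster -> host cluster."""

    def __init__(self):
        self.active = {}           # current disk state: guest index -> host cluster id
        self.snapshots = {}        # snapshot name -> its own full copy of the table
        self.refcount = Counter()  # host cluster id -> number of tables pointing at it
        self.data = {}             # host cluster id -> contents
        self._next = 0             # next unused host cluster id

    def write(self, guest, value):
        host = self.active.get(guest)
        if host is not None and self.refcount[host] == 1:
            self.data[host] = value   # only the active table uses it: overwrite in place
            return
        if host is not None:
            self.refcount[host] -= 1  # shared with a snapshot: leave it intact (copy-on-write)
        new = self._next
        self._next += 1
        self.data[new] = value
        self.refcount[new] = 1
        self.active[guest] = new

    def snapshot_create(self, name):
        table = dict(self.active)     # metadata is copied, data clusters are not
        for host in table.values():
            self.refcount[host] += 1
        self.snapshots[name] = table

    def snapshot_delete(self, name):
        for host in self.snapshots.pop(name).values():
            self.refcount[host] -= 1
            if self.refcount[host] == 0:  # no table references it any more: free it
                del self.refcount[host]
                del self.data[host]

    def read(self, guest, snapshot=None):
        table = self.active if snapshot is None else self.snapshots[snapshot]
        return self.data[table[guest]]


img = Qcow2Model()
img.write(0, "base")
img.snapshot_create("S1")
img.write(0, "monday")
img.snapshot_create("S2")
img.write(0, "tuesday")
img.snapshot_delete("S1")  # frees the "base" cluster; nothing is merged or moved
```

In this model, deleting S1 leaves both S2 and the current state readable exactly as before. The only effect is that the cluster holding "base", which no table references any longer, is freed.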
Perhaps not physically but logically it would appear that the data clusters were merged.
So if I deleted the
first snapshot, the base image stays the same but any data that has changed since the base image is now in the second snapshot's location. The merge with children explanation also implies that the base image is never touched
even if the first snapshot is deleted.
But if I delete a snapshot that has no children, is that essentially the same as reverting to the point that snapshot was created and all subsequent disk changes are lost? Or does it merge down to the parent snapshot? If I
delete all snapshots, would that revert to the base image?
No.  qcow2 has the concept of the current disk state of the running VM -
what you get when you boot the guest - and the snapshots - they are
read-only.

When you delete snapshots the current disk state (running VM) is
unaffected.

When you apply a snapshot this throws away the current disk state and
uses the snapshot as the new current disk state.  The read-only snapshot
itself is not modified in any way and you can apply the same snapshot
again as many times as you wish later.
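
The distinction between the writable current state and the read-only snapshots can be sketched in a few lines of Python. This is an illustration only: the Disk class and its method names are hypothetical, and a plain dictionary stands in for the disk contents.

```python
from types import MappingProxyType


class Disk:
    """Toy model (hypothetical): one writable current state plus read-only snapshots."""

    def __init__(self, state):
        self.current = dict(state)  # what the guest sees when it boots
        self.snapshots = {}         # name -> read-only view of a saved state

    def snapshot_create(self, name):
        # Freeze a copy of the current state; later writes do not touch it.
        self.snapshots[name] = MappingProxyType(dict(self.current))

    def snapshot_apply(self, name):
        # Throw away the current state and start again from the snapshot.
        self.current = dict(self.snapshots[name])

    def snapshot_delete(self, name):
        # Only the saved state goes away; the current state is untouched.
        del self.snapshots[name]


disk = Disk({"db": "good"})
disk.snapshot_create("monday")
disk.current["db"] = "corrupt"
disk.snapshot_apply("monday")   # current state is "good" again
disk.current["db"] = "newer"
disk.snapshot_delete("monday")  # current state stays "newer", it does not revert
```

Under this model, deleting every snapshot never rolls the disk back to an earlier image. Only applying a snapshot discards changes.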
So in essence the current state is a pointer to the latest data cluster, which is the only data cluster that can be modified.
I've seen it explained that a snapshot is very much like a timestamp so
deleting a timestamp removes the dividing line between writes that occurred before and after that time, so that data is really only removed if I revert
to some time stamp - all writes after that point are discarded. In this
explanation, deleting the oldest timestamp is essentially updating the base
image. Deleting all snapshots would leave me with the base image fully
updated.
Frankly, the second explanation sounds more reasonable to me, without having to figure out how copy-on-write works, but I'm dealing with important data
here and I don't want to mess it up by mishandling the snapshots.

Can someone provide a little clarity on this? Thanks!
If you want an analogy then git(1) is a pretty good one.  qcow2 internal
snapshots are like git tags.  Unlike branches, tags are immutable.  In
qcow2 you only have a master branch (the current disk state) from which
you can create a new tag or you can use git-checkout(1) to apply a
snapshot (discarding whatever your current disk state is).

Stefan
That's just making things less clear - I've never tried to understand git either. Thanks for the attempt though.
If I've gotten things correct, once the base image is established, there is a current disk state that points to a table containing all the writes since the base image. Creating a snapshot essentially takes that pointer and gives it the snapshot name, while creating a new current disk state pointer and data table where subsequent writes are recorded.
Deleting snapshots removes your ability to refer to a data table by name, but the table itself still exists anonymously as part of a chain of data tables between the base image and the current state.
This leaves a problem. The chain will very quickly get quite long which will impact performance. To combat this, you can use blockcommit to merge a child with its parent or blockpull to merge a parent with its child.
In my situation, I want to keep a week of daily snapshots in case something goes horribly wrong with the VM (I recently had a database file become corrupt, and reverting to the previous working day's image would have been a quick and easy solution, faster than recovering all the data tables from the previous day). I've been shutting down the VM, deleting the oldest snapshot and creating a new one before restarting the VM.
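
The daily rotation described above could be planned with a small helper like the following Python sketch. It makes several assumptions that are not in the original message: the function name rotation_commands, the daily-YYYY-MM-DD naming scheme (chosen so that names sort by date), and the use of virsh snapshot-create-as with --name to create the new snapshot. It only builds argument lists and runs nothing, so the commands can be reviewed against the virsh man page before being executed while the VM is shut down.

```python
from datetime import date


def rotation_commands(domain, existing, today, keep=7):
    """Return virsh argument lists that keep at most `keep` daily snapshots.

    `existing` holds the current snapshot names, assumed to follow the
    hypothetical daily-YYYY-MM-DD scheme so that sorting orders them by date.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    new_name = f"daily-{today.isoformat()}"
    if new_name in existing:
        raise ValueError(f"snapshot {new_name} already exists")
    names = sorted(existing) + [new_name]
    expired = names[:-keep]  # everything older than the newest `keep` names
    commands = [
        ["virsh", "snapshot-delete", "--domain", domain, "--snapshotname", name]
        for name in expired
    ]
    commands.append(
        ["virsh", "snapshot-create-as", "--domain", domain, "--name", new_name]
    )
    return commands


# Example with a hypothetical domain named "stretch" and seven existing snapshots.
existing = [f"daily-2019-06-0{day}" for day in range(3, 10)]
for argv in rotation_commands("stretch", existing, date(2019, 6, 10)):
    print(" ".join(argv))
```

Deleting before creating mirrors the order described in the message. With fewer than seven existing snapshots the helper emits only the create command.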
While your explanation confirms that this is safe, it also implies that I need to manage the data table chains. My first instinct is to use blockcommit before deleting the oldest snapshot, such as:
virsh blockcommit <vm name> <qcow2 file path> --top <oldest snapshot> --delete --wait
virsh snapshot-delete --domain <vm name> --snapshotname <oldest snapshot>
so that the base image contains the state as of one week earlier and the snapshot chains are limited to 7 links.
1) does this sound reasonable?
2) I note that the syntax in virsh man page is different from the syntax at https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_deployment_and_administration_guide/sect-backing-chain (RedHat uses --top and --base while the man page just has optional base and top names). I believe the RedHat guide is correct because the man page doesn't allow distinguishing between the base and the top for a commit.
However the need for specifying the path isn't obvious to me. Isn't the path contained in the VM definition?
Since blockcommit would make it impossible for me to revert to an earlier state (because I'm committing the oldest snapshot, if it screws up, I can't undo within virsh), I need to make sure this command is correct.

Trying this against a test VM, I ran into a roadblock. My command line and the results are:

# virsh blockcommit stretch "/home/secure/virtual/stretch.qcow2" --top stretchS3 --delete --wait

error: unsupported flags (0x2) in function qemuDomainBlockCommit

I get the same thing when the path to the qcow2 file isn't quoted.

I noted in https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/virtualization_administration_guide/sub-sect-domain_commands-using_blockcommit_to_shorten_a_backing_chain that the options use a single "-". However the results for that were:

# virsh blockcommit stretch /home/secure/virtual/stretch.qcow2 -top stretchS3 -delete -wait

error: Scaled numeric value '-top' for <--bandwidth> option is malformed or out of range

which looks like virsh doesn't like the single dashes and is trying to interpret them as positional options.


I also did a

# virsh domblklist stretch
Target     Source
------------------------------------------------
vda        /home/secure/virtual/stretch.qcow2
hda        -

and tried using vda instead of the full path in the blockcommit but got the same error.


Any ideas on what I'm doing wrong?
