
Re: [Qemu-devel] [PATCH 1/5] RFC: Efficient VM backup for qemu (v1)


From: Dietmar Maurer
Subject: Re: [Qemu-devel] [PATCH 1/5] RFC: Efficient VM backup for qemu (v1)
Date: Wed, 21 Nov 2012 11:10:07 +0000

> > +Note: It turned out that taking a qcow2 snapshot can take a very long
> > +time on larger files.
> 
> Hm, really? What are "larger files"? It has always been relatively quick when
> I tested it, though internal snapshots are not my focus, so that need not mean
> much.

300GB or larger
 
> If this is really an important use case for someone, I think qcow2 internal
> snapshots still have some potential for relatively easy performance
> optimisations.

I guess the problem is the small cluster size, so the reference table gets
quite large (for example, fvd uses 2GB to minimize table size).
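
To put rough numbers on that guess, here is a back-of-the-envelope count of
table entries. It assumes qcow2's default cluster size of 64 KiB, reads "300GB"
as 300 GiB, and takes the 2GB figure as the granularity fvd tracks. The helper
name is made up for this sketch, and per-entry sizes and the real on-disk table
layouts are not modeled.

```python
KiB = 1024
GiB = 1024 ** 3


def table_entries(image_size: int, unit_size: int) -> int:
    """Number of allocation units needed to cover the image (rounded up)."""
    return -(-image_size // unit_size)


image_size = 300 * GiB  # assumption: "300GB" read as 300 GiB

# Assumption: 64 KiB clusters (the qcow2 default).
print(table_entries(image_size, 64 * KiB))  # 4915200

# Assumption: 2 GiB units, the figure mentioned for fvd.
print(table_entries(image_size, 2 * GiB))   # 150
```

So the entry count differs by several orders of magnitude, which fits the
guess, but it is not a measurement of where the snapshot time actually goes.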
 
> But that just as an aside...
> 
> > +
> > +=Make it more efficient=
> > +
> > +To be more efficient, we simply need to avoid unnecessary steps. The
> > +following steps are always required:
> > +
> > +1.) read old data before it gets overwritten
> > +2.) write that data into the backup archive
> > +3.) write new data (VM write)
> > +
> > +As you can see, this involves only one read and two writes.
> 
> Looks like a nice approach to backup indeed.
> 
> The question is how to fit this into the big picture of qemu's live block
> operations. Much of it looks like an active mirror (which is still to be
> implemented), with the difference that it doesn't write the new, but the old
> data, and that it keeps a bitmap of clusters that should not be mirrored.
> 
> I'm not sure if this means that code should be shared between these two or
> if the differences are too big. However, both of them have things in common
> regarding the design. For example, both have a background part (copying the
> existing data) and an active part (mirroring/backing up data on writes). Block
> jobs are the right tool for the background part.

I already use block jobs. Or do you want to share more?
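
For reference, here is a toy model of how the two parts fit together with the
three steps quoted above. It is only a sketch, not the code from the patch: the
class name, the method names, and the tiny cluster size are all invented for
illustration, and the "archive" is just an in-memory list that is appended to
sequentially.

```python
CLUSTER = 4  # bytes per cluster; tiny on purpose, purely illustrative


class BackupJob:
    """Toy copy-before-write backup (hypothetical names, not qemu code)."""

    def __init__(self, image: bytearray):
        # Assumption: the image length is a multiple of CLUSTER.
        self.image = image
        self.clusters = len(image) // CLUSTER
        self.done = [False] * self.clusters  # bitmap: cluster already archived
        self.archive = []                    # appended sequentially: (index, data)

    def _save(self, idx: int) -> None:
        if self.done[idx]:
            return
        start = idx * CLUSTER
        old = bytes(self.image[start:start + CLUSTER])  # 1.) read old data
        self.archive.append((idx, old))                 # 2.) write it to the archive
        self.done[idx] = True

    def guest_write(self, offset: int, data: bytes) -> None:
        """Active part: save every touched cluster before it is overwritten."""
        if not data:
            return
        first = offset // CLUSTER
        last = (offset + len(data) - 1) // CLUSTER
        for idx in range(first, last + 1):
            self._save(idx)
        self.image[offset:offset + len(data)] = data    # 3.) write new data

    def background_step(self) -> bool:
        """Background part: archive one not-yet-saved cluster, if any is left."""
        for idx in range(self.clusters):
            if not self.done[idx]:
                self._save(idx)
                return True
        return False


image = bytearray(b"AAAABBBBCCCC")
job = BackupJob(image)
job.background_step()         # background job copies cluster 0
job.guest_write(8, b"zzzz")   # guest overwrites cluster 2; old data saved first
while job.background_step():  # background job copies the rest
    pass
print(job.archive)  # [(0, b'AAAA'), (2, b'CCCC'), (1, b'BBBB')]
```

The bitmap makes sure each cluster is read and archived only once, and the
archive ends up 'out of order' as soon as a guest write gets ahead of the
background job.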
 
> The active part is a bit more tricky. You're putting some code into block.c to
> achieve it, which is kind of ugly. 

Yes, but I tried to keep that small ;-)

> We have been talking about "block filters"
> previously that would provide a generic infrastructure, and at least in the
> mid term the additions to block.c must disappear.
> (Same for block.h and block_int.h - keep things as separated from the core as
> possible) Maybe we should introduce this infrastructure now.

I have no idea what you are talking about. Can you point me to the relevant
discussion?
 
> Another interesting point is how (or whether) to link block jobs with block
> filters. I think when the job is started, the filter should be inserted
> automatically, and when you cancel it, it should be stopped.
> When you pause the job... no idea. :-)
> 
> > +
> > +To make that work, our backup archive needs to be able to store image
> > +data 'out of order'. It is important to notice that this will not
> > +work with traditional archive formats like tar.
> 
> > +* works on any storage type and image format.
> > +* we can define a new and simple archive format, which is able to
> > +  store sparse files efficiently.
> 
> > +
> > +Note: Storing sparse files is a mess with existing archive formats.
> > +For example, tar requires information about holes at the beginning of
> > +the archive.
> 
> > +* we need to define a new archive format
> > +
> > +Note: Most existing archive formats are optimized to store small
> > +files including file attributes. We simply do not need that for VM
> > +archives.
> > +
> > +* archive contains data 'out of order'
> > +
> > +If you want to access image data in sequential order, you need to
> > +re-order archive data. It would be possible to do that on the fly,
> > +using temporary files.
> > +
> > +Fortunately, a normal restore/extract works perfectly with 'out of
> > +order' data, because the target files are seekable.
> 
> > +=Archive format requirements=
> > +
> > +The basic requirement for such a new format is that we can store image
> > +data 'out of order'. It is also very likely that we have less than
> > +256 drives/images per VM, and we want to be able to store VM
> > +configuration files.
> > +
> > +We have defined a very simple format with those properties, see:
> > +
> > +docs/specs/vma_spec.txt
> > +
> > +Please let us know if you know an existing format which provides the
> > +same functionality.
> 
> Essentially, what you need is an image format. You want to be independent
> from the source image formats, but you're okay with using a specific format
> for the backup (or you wouldn't have defined a new format for it).
> 
> The one special thing that you need is storing multiple images in one file.
> There's something like this already in qemu: qcow2 with its internal
> snapshots is basically a flat file system.
> 
> Not saying that this is necessarily the best option, but I think reusing
> existing formats and implementation is always a good thing, so it's an idea
> to consider.

AFAIK a qcow2 file cannot store data out of order. In general, a backup fd is
not seekable, and we only want to do sequential writes. Don't image formats
always require seekable fds?
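
To illustrate what I mean by sequential writes with 'out of order' data, here
is a minimal sketch. The record layout (an 8-byte offset and a 4-byte length
in front of each data block) and both function names are made up for this
example. This is not the layout from vma_spec.txt.

```python
import io
import struct

# Hypothetical record header: image offset (u64) and data length (u32).
HEADER = struct.Struct("<QI")


def append_record(out, offset: int, data: bytes) -> None:
    """Backup side: only appends, so 'out' may be a pipe or socket."""
    out.write(HEADER.pack(offset, len(data)))
    out.write(data)


def restore(archive, target) -> None:
    """Restore side: read records in stream order, seek only in the target."""
    while True:
        head = archive.read(HEADER.size)
        if not head:
            break  # clean end of archive
        if len(head) != HEADER.size:
            raise ValueError("truncated record header")
        offset, length = HEADER.unpack(head)
        data = archive.read(length)
        if len(data) != length:
            raise ValueError("truncated record data")
        target.seek(offset)  # the target file is seekable
        target.write(data)


archive = io.BytesIO()
append_record(archive, 8, b"CCCC")  # records arrive in arbitrary order
append_record(archive, 0, b"AAAA")
append_record(archive, 4, b"BBBB")

archive.seek(0)  # rewinding is only needed for this in-memory demo
target = io.BytesIO()
restore(archive, target)
print(target.getvalue())  # b'AAAABBBBCCCC'
```

The writer never seeks, and the reader only seeks in the restore target, which
is why a normal restore works fine with such an archive.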

Anyway, a qcow2 file is a really complex beast. I am quite unsure whether I
would use it for backup, even if that is possible.

That would require any external tool to include >=50000 LOC.

The vma reader code is about 700 LOC (quite easy).




