
Re: [Qemu-devel] Re: Strategic decision: COW format


From: Anthony Liguori
Subject: Re: [Qemu-devel] Re: Strategic decision: COW format
Date: Wed, 23 Feb 2011 09:47:50 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.15) Gecko/20101027 Lightning/1.0b1 Thunderbird/3.0.10

On 02/23/2011 09:36 AM, Avi Kivity wrote:
On 02/23/2011 05:29 PM, Anthony Liguori wrote:

existed, what about snapshots? Are we okay having a feature in a
prominent format that isn't going to meet users' expectations?

Is there any hope that an image with 100, 1000, or 10000 snapshots is
going to have even reasonable performance in qcow2?
Is there any hope for backing file chains of 1000 files or more? I
haven't tried it out, but in theory I'd expect that internal snapshots
could cope better with it than external ones because internal snapshots
don't have to go through the whole chain all the time.
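
To make the contrast concrete, here is a minimal sketch in C (illustrative only, not QEMU code; image_t, find_cluster() and the lookup helpers are made-up names). An external backing chain has to be walked file by file on a read miss, while an internal snapshot resolves through a single two-level table walk inside one image:

#include <stdint.h>
#include <stddef.h>

typedef struct image image_t;
struct image {
    image_t *backing;                             /* next file in the chain */
    int64_t (*find_cluster)(image_t *, uint64_t); /* cluster offset, or -1 */
};

/* External snapshots: a read that misses in the top image has to walk
 * the whole backing chain, so worst-case cost grows with chain length. */
int64_t chain_lookup(image_t *img, uint64_t cluster_idx)
{
    for (; img != NULL; img = img->backing) {
        int64_t off = img->find_cluster(img, cluster_idx);
        if (off >= 0) {
            return off;
        }
    }
    return -1; /* unallocated everywhere; reads back as zeros */
}

/* Internal snapshots: each snapshot keeps its own L1 table inside the
 * same image, so a lookup is one L1 entry plus one L2 entry no matter
 * how many snapshots exist. A NULL L2 table or zero entry means
 * "unallocated". */
int64_t internal_lookup(uint64_t **l1, uint64_t cluster_idx,
                        uint64_t l2_entries)
{
    uint64_t *l2 = l1[cluster_idx / l2_entries];
    if (l2 == NULL) {
        return -1;
    }
    uint64_t off = l2[cluster_idx % l2_entries];
    return off ? (int64_t)off : -1;
}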

I don't think there's a user expectation of backing file chains of 1000 files performing well. However, I've talked to a number of customers who have been interested in using internal snapshots for checkpointing, which would involve a large number of snapshots.

In fact, Fabrice originally added qcow2 because he was interested in doing reverse debugging. The idea of internal snapshots was to store a high number of checkpoints to allow reverse debugging to be optimized.

I don't see how that works, since the memory image is duplicated for each snapshot. So thousands of snapshots = terabytes of storage, and hours of creating the snapshots.
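
To put rough numbers on that (assumed figures, not from the thread): a guest with 4 GB of RAM checkpointed 1,000 times is about 4 TB of memory images before counting any disk state, and even at a sustained 1 GB/s of write bandwidth, writing those images out takes well over an hour.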

Fabrice wanted to use CoW as a mechanism to deduplicate the memory contents with the on-disk state, specifically to address this problem. For the longest time, there was a comment in the savevm code along these lines. It might still be there.

I think the lack of on-disk hashes was a critical missing bit to make this feature really work well.
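
Concretely, on-disk hashes could enable something like the following sketch (purely illustrative; none of these structures exist in qcow2, and the toy in-memory "store" stands in for the image file): hash each page of the memory image, and when identical content is already stored, take another reference to it instead of writing a second copy.

#include <stdint.h>
#include <stddef.h>
#include <string.h>

enum { PAGE = 4096, SLOTS = 1024 };

/* Toy stand-ins for the image file and a hash->cluster index. */
static uint8_t  store[SLOTS][PAGE];
static unsigned refcount[SLOTS];
static struct { uint64_t hash; int64_t off; int used; } idx[SLOTS];
static int64_t next_off;

/* FNV-1a, just to be concrete; a real design would likely want a
 * strong hash so collisions could be ignored or cheaply verified. */
static uint64_t page_hash(const uint8_t *p, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL;
    while (len--) {
        h = (h ^ *p++) * 0x100000001b3ULL;
    }
    return h;
}

/* Save one page of guest memory during a checkpoint: on a content
 * match, bump the refcount instead of writing a duplicate. (Toy code:
 * assumes the index never fills up.) */
int64_t save_page(const uint8_t *page)
{
    uint64_t h = page_hash(page, PAGE);
    size_t slot = h % SLOTS;
    while (idx[slot].used) {
        if (idx[slot].hash == h &&
            memcmp(store[idx[slot].off], page, PAGE) == 0) {
            refcount[idx[slot].off]++;   /* dedup hit: nothing written */
            return idx[slot].off;
        }
        slot = (slot + 1) % SLOTS;       /* linear probing */
    }
    int64_t off = next_off++;            /* miss: store a new cluster */
    memcpy(store[off], page, PAGE);
    refcount[off] = 1;
    idx[slot].hash = h;
    idx[slot].off = off;
    idx[slot].used = 1;
    return off;
}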

Migrate-to-file with block live migration, or even better, something based on Kemari would be a lot faster.


I think the way snapshot metadata is stored makes this unrealistic, since the snapshot entries are kept in more or less a linear array. To really support a high number of snapshots, you'd want to store a hash with each block that has a refcount > 1. I think you quickly end up reinventing btrfs in the process, though.

Can you elaborate? What's the problem with a linear array of snapshots (say up to 10,000 snapshots)?

Lots of things. The array will start to consume quite a bit of contiguous space as it gets larger, which means it needs to be relocated. Deleting a snapshot is a far more expensive operation than it needs to be.
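
Roughly the shape of the problem, as a simplified sketch (not the actual qcow2 code; the header fields are abbreviated from the spec and table_write() is a stand-in for the real I/O):

#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Simplified qcow2-style snapshot table entry. Entries are
 * variable-sized and packed back to back in one contiguous region
 * of the image. */
struct snapshot_header {
    uint64_t l1_table_offset;   /* this snapshot's private L1 table */
    uint32_t l1_size;
    uint16_t id_str_size;       /* variable-length id string and name */
    uint16_t name_size;         /* follow each fixed header */
    /* ... timestamps, VM state size, extra data ... */
};

/* Stand-in for writing the table region back to the image file. */
static int table_write(const uint8_t *table, size_t len)
{
    (void)table; (void)len;
    return 0;
}

/* Deleting the entry at byte offset 'off': the entire tail of the
 * table is shifted down and the region rewritten, and a table that
 * outgrows its allocation must be relocated wholesale because it has
 * to stay contiguous. */
int snapshot_delete(uint8_t *table, size_t *table_len,
                    size_t off, size_t entry_len)
{
    memmove(table + off, table + off + entry_len,
            *table_len - off - entry_len);
    *table_len -= entry_len;
    /* On top of this, the deleted snapshot's L1/L2 tables still have
     * to be walked to decrement cluster refcounts, so the real cost
     * scales with how much data the snapshot references. */
    return table_write(table, *table_len);
}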

Regards,

Anthony Liguori



