Re: [Qemu-devel] QCOW2 deduplication


From: Kevin Wolf
Subject: Re: [Qemu-devel] QCOW2 deduplication
Date: Thu, 28 Feb 2013 11:09:35 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On 28.02.2013 at 10:59, Stefan Hajnoczi wrote:
> On Wed, Feb 27, 2013 at 05:40:53PM +0100, Kevin Wolf wrote:
> > On 27.02.2013 at 16:58, Benoît Canet wrote:
> > > > > The current prototype of the QCOW2 deduplication uses 32-byte SHA256 or
> > > > > SKEIN hashes to identify each 4KB cluster with a very low probability
> > > > > of collisions.
> > > > 
> > > > How do you handle the rare collision cases? Do you read the original
> > > > cluster and compare the exact contents when the hashes match?
> > > 
> > > Stefan found a paper with the math required to compute the collision
> > > probability: http://plan9.bell-labs.com/sys/doc/venti/venti.html
> > >              (Section 3.1)
> > > Doing the math for 1 Exabyte of stored data with 4KB clusters and 256-bit
> > > hashes gives a probability of 2.57E-49.
> > > The probability being low enough I plan to code the read/compare as an
> > > option that the users would toggle.
> > > The people who wrote the deduplication in ZFS have done it this way.
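
The 2.57E-49 figure above can be reproduced with the usual birthday bound,
p ~= n*(n-1)/2 / 2^b, where n is the number of clusters and b the hash width
in bits. This is only a sketch: the function name is made up for illustration,
and it assumes "1 Exabyte" means 10**18 bytes and "4KB" means 4096 bytes.

```python
from fractions import Fraction


def collision_probability(stored_bytes: int, cluster_bytes: int, hash_bits: int) -> Fraction:
    """Birthday-bound estimate: (number of cluster pairs) / (number of hash values)."""
    n = stored_bytes // cluster_bytes      # number of hashed clusters
    pairs = n * (n - 1) // 2               # distinct pairs that could collide
    return Fraction(pairs, 2 ** hash_bits)


# Assumption: decimal exabyte (10**18 bytes), 4096-byte clusters, 256-bit hash.
p = collision_probability(10**18, 4096, 256)
print(f"{float(p):.2e}")   # prints 2.57e-49
```

Exact rational arithmetic is used so the tiny result is only rounded once,
when it is converted to a float for printing.
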
> > 
> > Fair enough. If you want to gamble with your data for some more
> > performance, you can turn it off. Should we add some compatible taint
> > flag after the image has been used without collision detection?
> 
> If the verification setting is stored in the qcow2 image header then
> it's essentially a taint flag.

This assumes that we won't allow enabling it on the qemu command line,
and that options can never be changed after the image is created. Both
are true today, but both should be changed sooner or later.
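
To illustrate the difference, here is a toy in-memory model in which the
verification setting can be flipped at any time while the taint flag is
sticky. It is not the qcow2 on-disk format or any existing QEMU code; the
class and field names are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class DedupStore:
    """Toy model only: hash -> cluster map with a changeable verify option."""
    verify: bool = True      # runtime option: compare contents on a hash match
    tainted: bool = False    # sticky: set once any write ran without verification
    clusters: dict = field(default_factory=dict)   # SHA-256 digest -> cluster bytes

    def write(self, data: bytes) -> bytes:
        if not self.verify:
            self.tainted = True          # never cleared, even if verify is re-enabled
        digest = hashlib.sha256(data).digest()
        existing = self.clusters.get(digest)
        if existing is None:
            self.clusters[digest] = data             # first copy of this content
        elif self.verify and existing != data:
            raise ValueError("hash collision detected")
        return digest                                # duplicate: reuse stored cluster


store = DedupStore(verify=False)
store.write(b"\x00" * 4096)   # written without collision detection
store.verify = True           # option changed after the fact
store.write(b"\x01" * 4096)
print(store.verify, store.tainted)   # prints: True True
```

Once the option can change, the current setting alone no longer tells you
whether the image was ever written without collision detection.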

Kevin