Re: [Qemu-devel] [RFC V6 00/33] QCOW2 deduplication core functionality
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [RFC V6 00/33] QCOW2 deduplication core functionality
Date: Mon, 11 Feb 2013 09:10:57 +0100
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, Feb 11, 2013 at 03:50:10AM +0100, Benoît Canet wrote:
> As you can see, dedup keeps me awake at night.
>
> I still think that there is a need for a deduplication implementation
> that would perform nearly as fast as regular qcow2.
>
> I thought about this: http://en.wikipedia.org/wiki/Normal_distribution.
>
> Not all blocks are equal for deduplication.
> Some will deduplicate well and some won't.
>
> My idea would be to periodically run a filter on the in-RAM tree in order to
> drop the least-performing and least-promising blocks.
>
> The low-performing blocks involved in a deduplication operation since the last
> run of the filter would be kept because they are promising, so they would
> survive and have a chance to climb among the top performers.
>
> The low-performing blocks not involved in a deduplication operation since the
> last run of the filter would be definitively dropped from the HashNode tree
> since they are losers.
>
> The blocks at the center of the bell curve would be kept since they are
> champions.
>
> This way this RAM-based implementation could offer speed while its memory
> usage stays limited.
This means inline dedup is opportunistic and not guaranteed to catch
every dedup.
There needs to be a trade-off between a hash's dedup score and its age.
Young hashes are allowed to stay for a while, even with low dedup scores
so they have a chance to accumulate dedups.
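
To make the score/age trade-off concrete, here is a minimal sketch in Python (not the qcow2 C code). Everything in it is an assumption for illustration: the HashEntry fields, the run_filter name, and the grace_runs parameter are hypothetical and do not come from the patch series. An entry is only eligible for eviction once it has survived grace_runs filter passes and has seen no dedup since the previous pass; among eligible entries the lowest scores go first.

```python
from dataclasses import dataclass


@dataclass
class HashEntry:
    """Hypothetical in-RAM record for one cluster hash (illustration only)."""
    dedup_count: int = 0        # total dedup hits since insertion (the "score")
    hits_since_filter: int = 0  # dedup hits since the previous filter run
    age: int = 0                # number of filter runs this entry has survived


def run_filter(table, max_entries, grace_runs=2):
    """Trim `table` (hash -> HashEntry) towards `max_entries` entries.

    Only entries that are old enough (age >= grace_runs) and were not
    involved in any dedup since the previous run may be dropped. The
    bound is therefore soft: if too few entries are eligible, the table
    stays above max_entries until the next run.
    """
    excess = len(table) - max_entries
    if excess > 0:
        candidates = [
            (entry.dedup_count, key)
            for key, entry in table.items()
            if entry.age >= grace_runs and entry.hits_since_filter == 0
        ]
        candidates.sort()  # lowest dedup score first
        for _, key in candidates[:excess]:
            del table[key]
    # Start a new observation window for the survivors.
    for entry in table.values():
        entry.age += 1
        entry.hits_since_filter = 0
    return table
```

With grace_runs=0 this degenerates to a pure score-based filter, which would tend to drop fresh hashes before they had any chance to accumulate dedups.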
I still think a lookup data structure that spills to disk is better, but
perhaps you have data that shows it's reasonable to expect decent dedup
rates with the opportunistic approach?
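
For comparison, a lookup structure that spills to disk could look roughly like the sketch below. This is only an illustration under stated assumptions: it uses Python with an SQLite table as the on-disk store, and the SpillingHashIndex class, its method names, and the cache_size parameter are hypothetical, not anything from the patch series or from qcow2. Every hash is written through to the store, and a bounded LRU cache in RAM serves the hot entries, so evicting from RAM never loses a dedup opportunity.

```python
import sqlite3
from collections import OrderedDict


class SpillingHashIndex:
    """Hypothetical hash -> cluster index with a bounded RAM cache."""

    def __init__(self, path=":memory:", cache_size=1024):
        # Pass a file path to actually persist; ":memory:" is for demos.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS hashes ("
            "hash BLOB PRIMARY KEY, cluster INTEGER NOT NULL)"
        )
        self.cache = OrderedDict()  # least recently used entry first
        self.cache_size = cache_size

    def _remember(self, h, cluster):
        self.cache[h] = cluster
        self.cache.move_to_end(h)
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict least recently used

    def insert(self, h, cluster):
        # Write-through: the backing store always holds every hash.
        self.db.execute(
            "INSERT OR REPLACE INTO hashes VALUES (?, ?)", (h, cluster)
        )
        self.db.commit()
        self._remember(h, cluster)

    def lookup(self, h):
        """Return the cluster for hash `h`, or None if it is unknown."""
        if h in self.cache:
            self.cache.move_to_end(h)
            return self.cache[h]
        row = self.db.execute(
            "SELECT cluster FROM hashes WHERE hash = ?", (h,)
        ).fetchone()
        if row is None:
            return None
        self._remember(h, row[0])
        return row[0]
```

The cost is one store lookup on every cache miss, which is the speed concern the opportunistic approach tries to avoid; the benefit is that no dedup is missed because of memory pressure.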
Stefan