
Re: [Qemu-block] RFC: Reducing the size of entries in the qcow2 L2 cache


From: Alberto Garcia
Subject: Re: [Qemu-block] RFC: Reducing the size of entries in the qcow2 L2 cache
Date: Wed, 20 Sep 2017 15:10:45 +0200
User-agent: Notmuch/0.18.2 (http://notmuchmail.org) Emacs/24.4.1 (i586-pc-linux-gnu)

On Wed 20 Sep 2017 09:06:20 AM CEST, Kevin Wolf wrote:
>> |-----------+--------------+-------------+---------------+--------------|
>> | Disk size | Cluster size | L2 cache    | Standard QEMU | Patched QEMU |
>> |-----------+--------------+-------------+---------------+--------------|
>> | 16 GB     | 64 KB        | 1 MB [8 GB] | 5000 IOPS     | 12700 IOPS   |
>> |  2 TB     |  2 MB        | 4 MB [1 TB] |  576 IOPS     | 11000 IOPS   |
>> |-----------+--------------+-------------+---------------+--------------|
>> 
>> The improvements are clearly visible, but it's important to point out
>> a couple of things:
>> 
>>    - L2 cache size is always < total L2 metadata on disk (otherwise
>>      this wouldn't make sense). Increasing the L2 cache size improves
>>      performance a lot (and makes the effect of these patches
>>      disappear), but it requires more RAM.
>
> Do you have the numbers for the two cases above if the L2 tables
> covered the whole image?

Yeah, sorry, it's around 60000 IOPS in both cases (more or less what I
also get with a raw image).
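
For reference, the cache size needed to cover the whole image follows
directly from the qcow2 layout: each L2 entry is 8 bytes and maps one
cluster. A quick sketch (the helper name is made up, just to show the
calculation):

    #include <stdint.h>

    /* Bytes of L2 metadata needed to map the whole image:
     * one 8-byte L2 entry per data cluster. */
    static uint64_t l2_metadata_size(uint64_t disk_size, uint64_t cluster_size)
    {
        uint64_t nb_clusters = (disk_size + cluster_size - 1) / cluster_size;
        return nb_clusters * 8;
    }

    /* 16 GB image, 64 KB clusters -> 2 MB of L2 metadata
     *  2 TB image,  2 MB clusters -> 8 MB of L2 metadata */

So the 1 MB and 4 MB caches in the table above each cover only half of
their image, which is what the values in brackets refer to.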

>>    - Doing random reads over the whole disk is probably not a very
>>      realistic scenario. During normal usage only certain areas of the
>>      disk need to be accessed, so performance should be much better
>>      with the same amount of cache.
>>    - I wrote a best-case scenario test (several I/O jobs each accessing
>>      a part of the disk that requires loading its own L2 table) and my
>>      patched version is 20x faster even with 64KB clusters.
>
> I suppose you chose the scenario so that the number of jobs is larger
> than the number of cached L2 tables without the patch, but smaller than
> the number of cache entries with the patch?

Exactly, I should have made that explicit :) I had 32 jobs, each of them
limited to a small area (32 MB), so with 4 KB cache entries you only need
128 KB of cache memory (vs 2 MB with the current code).
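
To spell out the arithmetic behind those numbers, here is a toy sketch
with the sizes from my test hard-coded:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        const uint64_t jobs = 32;
        const uint64_t area = 32ULL * 1024 * 1024;   /* working set per job */
        const uint64_t cluster_size = 64 * 1024;     /* also the L2 table size */
        const uint64_t entry_size = 8;               /* bytes per L2 entry */

        /* L2 metadata actually referenced by one job: 512 entries = 4 KB */
        uint64_t per_job = area / cluster_size * entry_size;

        /* Patched: one 4 KB cache entry per job is enough.
         * Current code: each job pins a whole 64 KB L2 table
         * (in this test every job falls under a different L2 table). */
        printf("patched: %" PRIu64 " KB\n", jobs * per_job / 1024);       /* 128  */
        printf("current: %" PRIu64 " KB\n", jobs * cluster_size / 1024);  /* 2048 */
        return 0;
    }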

> We will probably need to do some more benchmarking to find a good
> default value for the cached chunks. 4k is nice and small, so we can
> cover many parallel jobs without using too much memory. But if we have
> a single sequential job, we may end up doing the metadata updates in
> small 4k chunks instead of doing a single larger write.

Right, although a 4K table can already hold pointers to 512 data
clusters, so even if you do sequential I/O you don't need to update the
metadata so often, do you?

I guess the default value should probably depend on the cluster size.
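
To put numbers on that (the helper name is made up, just for
illustration): a 4 KB chunk maps 512 clusters worth of guest data, so
how much sequential I/O fits in one chunk depends directly on the
cluster size.

    #include <stdint.h>

    /* Guest data mapped by one 4 KB chunk of an L2 table:
     * 4096 / 8 = 512 entries, each mapping one cluster. */
    static uint64_t data_per_4k_chunk(uint64_t cluster_size)
    {
        return (4096 / 8) * cluster_size;
    }

    /* 64 KB clusters -> 32 MB of guest data per chunk
     *  2 MB clusters ->  1 GB of guest data per chunk */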

>>    - We need a proper name for these sub-tables that we are loading
>>      now. I'm actually still struggling with this :-) I can't think of
>>      any name that is clear enough and not too cumbersome to use (L2
>>      subtables? => Confusing. L3 tables? => they're not really that).
>
> L2 table chunk? Or just L2 cache entry?

Yeah, something like that, but let's see how variables end up being
named :)

Berto


