qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 2/2] qcow2: Avoid memory over-allocation on comp


From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH 2/2] qcow2: Avoid memory over-allocation on compressed images
Date: Wed, 21 Feb 2018 12:32:23 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0

On 02/21/2018 11:39 AM, Kevin Wolf wrote:
See my commit message comment - we have other spots in the code base that
blindly g_malloc(2 * s->cluster_size).

Though is that a reason to do the same in new code or to phase out such
allocations whenever you touch them?

Touché.


And I intended (but sent the email without amending my commit) to use
g_malloc().  But as Berto has convinced me that an externally produced
image can convince us to read up to 4M (even though we don't need that
much to decompress), I suppose that the try_ variant plus checking is
reasonable (and care in NULL'ing out if one but not both allocations
succeed).

Sounds good.

Another thought I had is whether we should do per-request allocation for
compressed clusters, too, instead of having per-BDS buffers.

The only benefit of a per-BDS buffer is that we cache things - multiple sub-cluster reads in a row all from the same compressed cluster benefit from decompressing only once. The drawbacks of a per-BDS buffer: we can't do things in parallel (everything else in qcow2 drops the lock around bdrv_co_pread[v]), so the initial read prevents anything else in the qcow2 layer from progressing.

I also wonder - since we ARE allowing multiple parallel readers in other parts of qcow2 (without a patch, decompression is not in this boat, but decryption and even bounce buffers due to lower-layer alignment constraints are), what sort of mechanisms do we have for using a pool of reusable buffers, rather than having each cluster access that requires a buffer malloc and free the buffer on a per-access basis? I don't know how much time the malloc/free per-transaction overhead adds, or if it is already much smaller than the actual I/O time.

But note that while reusable buffers from a pool would cut down on the per-I/O malloc/free overhead if we switch decompression away from per-BDS buffer, it would still not solve the fact that we only get the caching ability where multiple sub-cluster requests from the same compressed cluster require only one decompression, since that's only possible on a per-BDS caching level.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



reply via email to

[Prev in Thread] Current Thread [Next in Thread]