qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 0/6] qcow2: Make the L2 cache cover the whole im


From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH 0/6] qcow2: Make the L2 cache cover the whole image by default
Date: Mon, 30 Jul 2018 14:44:53 +0200
User-agent: Mutt/1.9.1 (2017-09-22)

Am 30.07.2018 um 14:13 hat Leonid Bloch geschrieben:
> On 07/30/2018 01:55 PM, Kevin Wolf wrote:
> > Am 29.07.2018 um 23:27 hat Leonid Bloch geschrieben:
> > > This series makes the qcow2 L2 cache cover the entire image by default.
> > > The importance of this change is in noticeable performance improvement,
> > > especially with heavy random I/O. The memory overhead is very small:
> > > only 1 MB of cache for every 8 GB of image size. On systems with very
> > > limited RAM the maximal cache size can be limited by the existing
> > > cache-size and l2-cache-size options.
> > > 
> > > The L2 cache is also resized accordingly, by default, if the image is
> > > resized.
> > 
> > I agree with changing the defaults, I would have proposed a change
> > myself soon. We have been offering cache size options for a long time,
> > and management tools are still ignoring them. So we need to do something
> > in QEMU.
> > 
> > Now, what the exact defaults should be, is something to use a bit more
> > thought on. You say "the memory overhead is very small", without going
> > into details. Let's check.
> > 
> > This is the formula from your patch (unit is bytes):
> > 
> >      uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
> > 
> > So we get:
> > 
> >      64k clusters, 8G image:   1M (maximum covered by old default)
> >      64k clusters, 1T image: 128M
> >      1M clusters, 1T image:    8M
> >      512b clusters, 1T image: 16G
> > 
> > 1T is by far not the maximum size for a qcow2 image, and 16G memory
> > usage is definitely not "very small" overhead. Even the 128M for the
> > default cluster size are already a lot. So the simplistic approch of
> > this series won't work as a default.
> > 
> > Let's refine it a bit:
> > 
> > * Basing it on max_l2_cache sounds fine generally
> > * Cap it at 32 MB (?) by default
> 
> Yes! A great idea! Definitely necessary.
> 
> > * Enable cache-clean-interval=30 (?) by default to compensate a bit for
> >    the higher maximum memory usage
> 
> Do you think that we need this if we cap the cache at 32 MB? And that's only
> the cap. 256 GB+ images are not that often used. And compared to the overall
> QEMU overhead, of over 500 MB, extra 32 in the worst case is not that much,
> considering the performance gain.

Don't forget that if you take a few snapshots so you get a backing file
chain, every layer in the chain has its own cache. So capping at 32 MB
with 9 backing files still means that we use 320 MB.

And honestly, is there a real reason not to use cache-clean-interval by
default? Even if it doesn't help much in some cases, it should hardly
ever hurt. If the image was completely idle for 30 seconds (or whatever
number we choose), having to reload some 64k from the disk on the next
access shouldn't make a big difference.

> > Another thing I just noticed while looking at the code is that
> > cache-clean-interval only considers blocks that aren't dirty, but
> > doesn't take measures to get dirty blocks written out, so we depend on
> > the guest flushing the cache before we can get free the memory. Should
> > we attempt to write unused dirty entries back? Berto?
> > 
> > > The reasons for making this behavior default, unlike in the patches I
> > > have sent previously, are the following:
> > > 1) Unelegant complications with making an option to accept either a
> > >     size or a string, while outputting a proper error message.
> > > 2) More bulky logic to sort out what to do if the image is being resized
> > >     but the (defined) overall cache size is too small to contain the
> > >     new l2-cache-size.
> > > (Making this behavior default resolves all of these technical issues
> > > neatly)
> > 
> > The default must always correspond to some explicit setting. We can only
> > change defaults relatively freely because we can assume that if a user
> > cared about the setting, they would have specified it explicitly. If
> > it's not possible to specify a setting explicitly, we can't make this
> > assumption any more, so we'd be stuck with the default forever.
> > 
> > Now, considering what I wrote above, an alternate wouldn't be the best
> > way to represent this any more. We do have a cache size again (the 32 MB
> > cap) even while trying to cover the whole image.
> > 
> > Maybe the right interface with this in mind would be a boolean option
> > that specifies whether the given cache sizes are exact values (old
> > semantics) or maximum values, which are limited to what the actual
> > images size requires. If you do want the "real" full mode, you'd set
> > l2-cache-size=INT64_MAX and exact-cache-sizes=false (this could use a
> > better name). The new default would be l2-cache-size=32M,
> > exact-cache-sizes=false. The old default is cache-size=1M,
> > exact-cache-sizes=true.
> 
> The existing cache-size and l2-cache-size options are already documented as
> maximal values. It would make sense to actually make them as such: the
> caches will be as big as necessary to cover the entire image (no need for
> them to be more than that) but they will be capped by the current options,
> while the new default of l2-cache-size will be 32M. Why would one need
> exact-cache-sizes, if they would be MIN(needed, max)?
> Does it sound reasonable?

I think you have a point there.

There is one case where we need to cover more than the virtual disk
size, and that is the VM state for internal snapshots. These are
sequential accesses, though, so it doesn't matter for it much how many
L2 tables we have cached. And having to reload some tables after taking
or loading a snapshot sounds okay, too.

So essentially everything boils down to applying max_l2_cache even if a
L2 size was explicitly specified (which makes cache-size=INT64_MAX a
reasonable option), changing the default from cache-size=1M to
cache-size=32M (covers 256 GB with the default 64k clusters), and
enabling cache-clean-interval by default. Right?

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]