Re: [Qemu-devel] [RFC] Proposed qcow2 extension: subcluster allocation


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [RFC] Proposed qcow2 extension: subcluster allocation
Date: Mon, 10 Apr 2017 16:32:23 +0100
User-agent: Mutt/1.8.0 (2017-02-23)

On Fri, Apr 07, 2017 at 03:01:29PM +0200, Kevin Wolf wrote:
> On 07.04.2017 at 14:20, Stefan Hajnoczi wrote:
> > On Thu, Apr 06, 2017 at 06:01:48PM +0300, Alberto Garcia wrote:
> > > Here are the results (subcluster size in brackets):
> > > 
> > > |-----------------+----------------+-----------------+-------------------|
> > > |  cluster size   | subclusters=on | subclusters=off | Max L2 cache size |
> > > |-----------------+----------------+-----------------+-------------------|
> > > |   2 MB (256 KB) |   440 IOPS     |  100 IOPS       | 160 KB (*)        |
> > > | 512 KB  (64 KB) |  1000 IOPS     |  300 IOPS       | 640 KB            |
> > > |  64 KB   (8 KB) |  3000 IOPS     | 1000 IOPS       |   5 MB            |
> > > |  32 KB   (4 KB) | 12000 IOPS     | 1300 IOPS       |  10 MB            |
> > > |   4 KB  (512 B) |   100 IOPS     |  100 IOPS       |  80 MB            |
> > > |-----------------+----------------+-----------------+-------------------|
> > > 
> > >                 (*) The L2 cache must be a multiple of the cluster
> > >                     size, so in this case it must be 2MB. On the table
> > >                     I chose to show how much of those 2MB are actually
> > >                     used so you can compare it with the other cases.
> > > 
> > > Some comments about the results:
> > > 
> > > - For the 64KB, 512KB and 2MB cases, having subclusters increases
> > >   write performance roughly threefold. This happens because for each
> > >   cluster allocation there's less data to copy from the backing
> > >   image. For the same reason, the smaller the cluster, the better the
> > >   performance. As expected, 64KB clusters with no subclusters perform
> > >   roughly the same as 512KB clusters with 64KB subclusters.
> > > 
> > > - The 32KB case is the most interesting one. Without subclusters it's
> > >   not very different from the 64KB case, but having a subcluster with
> > >   the same size as the I/O block eliminates the need for COW entirely
> > >   and the performance skyrockets (10 times faster!).
> > > 
> > > - 4KB is however very slow. I attribute this to the fact that the
> > >   cluster size is so small that a new cluster needs to be allocated
> > >   for every single write and its refcount updated accordingly. The L2
> > >   and refcount tables are also so small that they are too inefficient
> > >   and need to grow all the time.
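The copy-on-write argument in the comments above can be put into numbers. The following is a minimal sketch, not code from QEMU: it assumes 4 KB guest writes (the I/O block size implied by the 32 KB case), that each write is aligned and lands in a previously unallocated allocation unit, and the helper name cow_bytes is made up for illustration.

```python
KB = 1024

def cow_bytes(alloc_unit: int, write_size: int = 4 * KB) -> int:
    """Bytes that must be copied from the backing image (or filled in)
    when one aligned write first touches an unallocated allocation unit.

    Assumption: the write fits inside a single unit. If the write covers
    the whole unit, nothing needs to be copied.
    """
    if write_size >= alloc_unit:
        return 0
    return alloc_unit - write_size

# (cluster size, subcluster size) pairs taken from the table in the thread
configs = [
    (2048 * KB, 256 * KB),
    (512 * KB, 64 * KB),
    (64 * KB, 8 * KB),
    (32 * KB, 4 * KB),
]

for cluster, subcluster in configs:
    with_sub = cow_bytes(subcluster)   # allocation unit is the subcluster
    without_sub = cow_bytes(cluster)   # allocation unit is the whole cluster
    print(f"cluster {cluster // KB:>4} KB: "
          f"COW {with_sub // KB:>4} KB with subclusters, "
          f"{without_sub // KB:>4} KB without")
```

Under these assumptions the 32 KB (4 KB) configuration is the only one where the copied amount drops to zero, which is consistent with the jump in IOPS reported for that row.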
> > > 
> > > Here are the results when writing to an empty 40GB qcow2 image with no
> > > backing file. The numbers are of course different but as you can see
> > > the patterns are similar:
> > > 
> > > |-----------------+----------------+-----------------+-------------------|
> > > |  cluster size   | subclusters=on | subclusters=off | Max L2 cache size |
> > > |-----------------+----------------+-----------------+-------------------|
> > > |   2 MB (256 KB) |  1200 IOPS     |  255 IOPS       | 160 KB            |
> > > | 512 KB  (64 KB) |  3000 IOPS     |  700 IOPS       | 640 KB            |
> > > |  64 KB   (8 KB) |  7200 IOPS     | 3300 IOPS       |   5 MB            |
> > > |  32 KB   (4 KB) | 12300 IOPS     | 4200 IOPS       |  10 MB            |
> > > |   4 KB  (512 B) |   100 IOPS     |  100 IOPS       |  80 MB            |
> > > |-----------------+----------------+-----------------+-------------------|
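The "Max L2 cache size" column can be reproduced with a short calculation. This is a sketch under two assumptions: the image is 40 GB as stated for the second test, and each L2 entry takes 8 bytes (the standard qcow2 entry size, which happens to match the column). The function name max_l2_cache_bytes is made up for illustration.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

def max_l2_cache_bytes(disk_size: int, cluster_size: int, entry_size: int = 8) -> int:
    """L2 cache needed to map the whole disk: one entry per cluster.

    Assumption: 8-byte L2 entries, which reproduces the numbers in the table.
    """
    return (disk_size // cluster_size) * entry_size

disk = 40 * GB
for cluster in (2 * MB, 512 * KB, 64 * KB, 32 * KB, 4 * KB):
    size = max_l2_cache_bytes(disk, cluster)
    print(f"cluster {cluster // KB:>4} KB -> L2 cache {size // KB} KB")
```

The cache size is inversely proportional to the cluster size, which is why the 4 KB row needs 80 MB of L2 cache to cover the same 40 GB disk.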
> > 
> > I don't understand why subclusters=on performs so much better when
> > there's no backing file.  Is qcow2 zeroing out the 64 KB cluster with
> > subclusters=off?
> > 
> > It ought to just write the 4 KB data when a new cluster is touched.
> > Therefore the performance should be very similar to subclusters=on.
> 
> No, it can't do that. Nobody guarantees that the cluster contains only
> zeros when we don't write them. It could have been used before and then
> either freed on a qcow2 level or we could be sitting on a block device
> rather than a file.

I thought we had the no-op optimization for clusters allocated at the
end of a POSIX file.  All the more reason to add sub-clusters!
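The condition being discussed can be sketched as a small predicate. This is a hypothetical illustration, not QEMU's actual logic: the function name and parameters are invented, and it only encodes the two points raised in the thread (a reused cluster or a block device may hold stale data, while a region beyond the end of a regular POSIX file reads back as zeros once the file is extended).

```python
def can_skip_zero_fill(cluster_offset: int, file_size: int, is_regular_file: bool) -> bool:
    """Return True if a newly allocated cluster is known to read as zeros
    without explicitly writing them.

    Hypothetical helper: names and parameters are assumptions for illustration.
    """
    if not is_regular_file:
        # Block device: whatever was stored there before is still there.
        return False
    # In a regular file, a cluster at or beyond the current end of file has
    # never been written, so extending the file exposes zeros there.
    # A cluster below the end of file may have been used and freed earlier.
    return cluster_offset >= file_size

MB = 1024 * 1024
print(can_skip_zero_fill(10 * MB, 10 * MB, True))   # appended to a regular file
print(can_skip_zero_fill(2 * MB, 10 * MB, True))    # reused cluster inside the file
print(can_skip_zero_fill(10 * MB, 10 * MB, False))  # block device
```

When the predicate is false, the part of the cluster not covered by the guest write has to be filled in explicitly, and a smaller allocation unit reduces that cost.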

Stefan
