Re: Block alignment of qcow2 compress driver


From: Eric Blake
Subject: Re: Block alignment of qcow2 compress driver
Date: Fri, 28 Jan 2022 15:22:26 -0600
User-agent: NeoMutt/20211029-256-77b59a

On Fri, Jan 28, 2022 at 01:30:53PM +0000, Richard W.M. Jones wrote:
> > 
> > In qcow2, only the whole cluster can be compressed, so writing
> > compressed data means having to write the whole cluster.  qcow2
> > could implement the padding by itself, but we decided to just leave
> > the burden of only writing full clusters (with the COMPRESSED write
> > flag) on the callers.
> 
> I feel like this may be a bug in what qemu-nbd advertises.  Currently
> it is:
> 
> $ qemu-nbd -t --image-opts driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=output.qcow2 &
> [2] 2068900
> $ nbdinfo nbd://localhost

>               block_size_minimum: 65536    <---
>               block_size_preferred: 65536
>               block_size_maximum: 33554432
> 
> block_size_preferred is (rightly) set to 64K, as that's what the
> compress + qcow2 combination prefers.
> 
> But block_size_minimum sounds as if it should be 512 or 1, if qemu-nbd
> is able to reassemble smaller than preferred requests, even if they
> are suboptimal.

When compression is involved, 64K is the minimum block size at the
qcow2 layer, but the qemu NBD layer is relying on the generic block
core code to do read-modify-write (RMW) on anything smaller than that.
If the RMW doesn't work, we may have a bug in the block layer.  Even if
it does appear to
work, I'm not sure whether the block layer is able to recompress a
cluster - it may be that the act of RMW on a partially-written
initially-compressed cluster causes that cluster to no longer be
compressed, at which point, while your write succeeded, you are no
longer getting any compression.
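
To make the RMW step concrete, here is a rough Python sketch of what
widening a small write to whole blocks looks like.  It is not the qemu
block layer code: the bytearray standing in for the image, the name
rmw_write, and the bounds check are all assumptions made for
illustration, and it says nothing about whether the rewritten cluster
ends up compressed.

```python
BLOCK = 65536  # minimum block size from the nbdinfo output quoted above


def align_down(value, alignment):
    """Round value down to a multiple of alignment."""
    return value - value % alignment


def align_up(value, alignment):
    """Round value up to a multiple of alignment."""
    return align_down(value + alignment - 1, alignment)


def rmw_write(backing, offset, data, block=BLOCK):
    """Apply an unaligned write using only block-aligned I/O.

    backing is a bytearray standing in for the image (hypothetical).
    Returns the (start, end) of the aligned range that was rewritten.
    """
    start = align_down(offset, block)
    end = align_up(offset + len(data), block)
    if offset < 0 or end > len(backing):
        raise ValueError("write falls outside the backing store")

    buf = bytearray(backing[start:end])        # read the whole blocks
    rel = offset - start
    buf[rel:rel + len(data)] = data            # modify the small region
    backing[start:end] = buf                   # write whole blocks back
    return start, end
```

Note that a 5-byte write at offset 512 turns into a full 64K read plus
a full 64K write, which is why sub-minimum requests are suboptimal even
when they succeed.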

So, while it is a nice quality-of-implementation feature of qemu-nbd
that we can rely on the block layer RMW to accept client requests
smaller than the advertised minimum block size, I still think the
advertised size is correct, and that the client is in violation of the
spec if it requests the block size information but then does not honor
the advertised minimum.  And yes, while it is a pain to hack nbdcopy to
pay more attention to block sizing, I think in the long run it will be
worth it.
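
As a rough illustration of what paying attention to block sizing could
mean on the client side, here is a Python sketch that widens an
arbitrary byte range into requests that are aligned to the advertised
minimum and no larger than the advertised maximum.  This is not how
nbdcopy is written; the function name aligned_requests and its
parameters are made up for the example, and it assumes the export size
is itself a multiple of the minimum block size.

```python
def aligned_requests(offset, length, minimum, maximum, export_size):
    """Yield (offset, length) requests covering [offset, offset+length).

    Each request starts on a multiple of minimum, and is at most
    maximum bytes long.  All names here are hypothetical.
    """
    if minimum <= 0 or maximum < minimum:
        raise ValueError("invalid block size constraints")
    if length <= 0:
        return

    # Largest request size that is still a multiple of the minimum.
    step = maximum - maximum % minimum

    start = offset - offset % minimum                   # round down
    end = -(-(offset + length) // minimum) * minimum    # round up
    end = min(end, export_size)                         # stay in the export

    pos = start
    while pos < end:
        n = min(step, end - pos)
        yield pos, n
        pos += n
```

With the sizes from the nbdinfo output above, a 1024-byte range at
offset 512 becomes a single request for the first 64K block, so a copy
tool would have to buffer and merge data for the rest of that block
before writing it.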

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
