Re: [Qemu-block] [PATCH] doc: Propose NBD_FLAG_INIT_ZEROES extension


From: Kevin Wolf
Subject: Re: [Qemu-block] [PATCH] doc: Propose NBD_FLAG_INIT_ZEROES extension
Date: Wed, 7 Dec 2016 11:44:25 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On 06.12.2016 at 16:21, Eric Blake wrote:
> On 12/06/2016 03:25 AM, Kevin Wolf wrote:
> > On 06.12.2016 at 00:42, Eric Blake wrote:
> >> While not directly related to NBD_CMD_WRITE_ZEROES, the qemu
> >> team discovered that it is useful if a server can advertise
> >> whether an export is in a known-all-zeroes state at the time
> >> the client connects.
> > 
> > Does a server usually have the information to set this flag, other than
> > querying the block status of all blocks at startup? If so, the client
> > could just query this by itself.
> 
> Well, only if the client can query information at all (we don't have the
> documentation finished for extent queries, let alone a reference
> implementation).

Right, but I think we all agree that this is something that is necessary
and will come sooner or later.

> > The patch that was originally sent to qemu-devel just forwarded qemu's
> > .bdrv_has_zero_init() call to the server. However, what this function
> > returns is not a known-all-zeroes state on open, but just a
> > known-all-zeroes state immediately after bdrv_create(), i.e. creating a
> > new image. Then it becomes information that is easy to get and doesn't
> > involve querying all blocks (e.g. true for COW image formats, true for
> > raw on regular files, false for raw on block devices).
> 
> Just because the NBD spec describes the bit does NOT require that
> servers HAVE to set the bit on all images that are all zeroes.  It is
> perfectly compliant if the server never advertises the bit.

True, but if no server exists that would actually make use of the
feature, it's kind of useless to include it in the spec.

I think we should have concrete use cases in mind when extending the
spec, and explain them in the commit message. Just "maybe this could be
useful for someone sometime" isn't a good enough justification if you
ask me.

> That said, I think there are cases where qemu can easily advertise the
> bit.
> 
> I _do_ agree that it is NOT as trivial as the qemu server just
> forwarding the value of .bdrv_has_zero_init() - the server HAS to prove
> that no data has been written to the image.  But for a qcow2 image just
> created with qemu-img, it is a fairly easy proof: If the L1 table has
> all-zero entries, then the image has not been written to yet.  Reading
> the L1 table for all-zeroes is only a single cluster read, which is MUCH
> faster than crawling the entire image for extent status.  And for
> regular files, a single lseek(SEEK_DATA) is sufficient to see if the
> entire image is currently sparse.
> 
> Note that I only proposed the NBD implementation - it still remains to
> be coded into the qemu code for the client to make use of the bit
> (fairly easy: if the bit is set, the client can make its own
> .bdrv_has_zero_init() return true), as well as for the server to set the
> bit (harder: the server has to check .bdrv_has_zero_init() of the
> wrapped image, but also has to prove the image is still unwritten).
> Maybe this means that qemu's block layer wants to add a new
> .bdrv_has_been_written() [or whatever name] to better abstract the proof
> across drivers.  But those patches would be qemu 2.9 material, and do
> not need to further cc the NBD list.
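
As a rough illustration of the two proofs described in the quoted text
(an all-zero qcow2 L1 table, and a single lseek(SEEK_DATA) on a regular
file), here is a minimal Python sketch. It is not qemu code. The function
names are made up for this example, the header layout covers only the
qcow2 fields up to l1_table_offset, and the qcow2 check deliberately
ignores backing files, snapshots and anything else that would need
separate consideration. SEEK_DATA is assumed to be supported by the
platform and filesystem; where it is not, the file is simply reported as
containing data.

```python
import errno
import os
import struct

QCOW2_MAGIC = b"QFI\xfb"
# Big-endian qcow2 header fields, up to and including l1_table_offset:
# magic, version, backing_file_offset, backing_file_size, cluster_bits,
# size, crypt_method, l1_size, l1_table_offset
QCOW2_HEADER_FMT = ">4sIQIIQIIQ"


def file_is_fully_sparse(path):
    """Return True if lseek(SEEK_DATA) finds no data extent in the file.

    Hypothetical helper: ENXIO from SEEK_DATA at offset 0 means there is
    no data at or after the start of the file.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, 0, os.SEEK_DATA)
    except OSError as exc:
        if exc.errno == errno.ENXIO:
            return True
        raise
    finally:
        os.close(fd)
    return False


def qcow2_l1_is_all_zero(path):
    """Return True if every entry of the active L1 table is zero.

    Hypothetical helper: an all-zero L1 table means no cluster is mapped
    in the active layer. Backing files and snapshots are NOT checked.
    """
    header_len = struct.calcsize(QCOW2_HEADER_FMT)
    with open(path, "rb") as f:
        raw = f.read(header_len)
        if len(raw) < header_len:
            raise ValueError("file too short for a qcow2 header")
        fields = struct.unpack(QCOW2_HEADER_FMT, raw)
        magic, l1_size, l1_table_offset = fields[0], fields[7], fields[8]
        if magic != QCOW2_MAGIC:
            raise ValueError("not a qcow2 image")
        f.seek(l1_table_offset)
        l1_table = f.read(l1_size * 8)  # each L1 entry is 8 bytes
        if len(l1_table) != l1_size * 8:
            raise ValueError("truncated L1 table")
    return not any(l1_table)
```

Either check only shows that nothing is allocated right now. It says
nothing about whether the image was written and later discarded, and a
server would still have to combine it with the zero-init property of the
underlying storage.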

qemu doesn't really know whether an image has been written to since it
has been created. The interesting case is probably where the image is
created externally with qemu-img before it's exported either with
qemu-nbd or the builtin server, and then we use it as a mirror target.

Even in the rare cases where both image creation and the NBD server are
in the same process, bdrv_create() doesn't work on a BlockDriverState,
but just on a filename. So even then you would need hacks such as
remembering file names between create and the first open.

> > This is useful for 'qemu-img convert', which creates an image and then
> > writes the whole contents, but I'm not sure if this property is
> > applicable for NBD, which I think doesn't even have a create operation.
> 
> Another option on the NBD server side is to create a server option -
> when firing up a server to serve a particular file as an export, the
> user can explicitly tell the server to advertise the bit because the
> user has side knowledge that the file was just created (and then the
> burden of misbehavior is on the user if they mistakenly request the
> advertisement when it is not true).

Maybe that's the only practical approach.

Kevin
