qemu-devel
Re: [Qemu-devel] [RFC 1/1] nbd (specification): add NBD_CMD_WRITE_ZEROES


From: Denis V. Lunev
Subject: Re: [Qemu-devel] [RFC 1/1] nbd (specification): add NBD_CMD_WRITE_ZEROES command
Date: Thu, 18 Feb 2016 07:46:08 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1

On 02/17/2016 11:58 PM, Eric Blake wrote:
On 02/17/2016 11:10 AM, Denis V. Lunev wrote:
This patch proposes a new command to reduce the amount of data passed
through the wire when it is known that the data is all zeroes. This
functionality is generally useful for mirroring or backup operations.

Currently available NBD_CMD_TRIM command can not be used as the
specification explicitely says that "a client MUST NOT make any
s/explicitely/explicitly/

assumptions about the contents of the export affected by this
[NBD_CMD_TRIM] command, until overwriting it again with `NBD_CMD_WRITE`"

A particular use case could be the following:

QEMU project uses own implementation of NBD server to transfer data
in between different instances of QEMU. Typically we tranfer VM virtual
s/tranfer/transfer/

disks over this channel. VM virtual disks are sparse and thus the
efficiency of backup and mirroring operations could be improved a lot.

Signed-off-by: Denis V. Lunev <address@hidden>
---
  doc/proto.md | 7 +++++++
  1 file changed, 7 insertions(+)

diff --git a/doc/proto.md b/doc/proto.md
index 43065b7..c94751a 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -241,6 +241,8 @@ immediately after the global flags field in oldstyle negotiation:
    schedule I/O accesses as for a rotational medium
  - bit 5, `NBD_FLAG_SEND_TRIM`; should be set to 1 if the server supports
    `NBD_CMD_TRIM` commands
+- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; should be set to 1 if the server
+  supports `NBD_CMD_WRITE_ZEROES` commands
  ##### Client flags
@@ -446,6 +448,11 @@ The following request types exist:
      about the contents of the export affected by this command, until
      overwriting it again with `NBD_CMD_WRITE`.
+* `NBD_CMD_WRITE_ZEROES` (6)
+
+    A request to write zeroes. The command is the functional equivalent
+    of `NBD_CMD_WRITE`, but without a payload sent through the channel.
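On the wire, such a request could look like the following sketch (not part of the patch; the 28-byte oldstyle request header and `NBD_REQUEST_MAGIC` are from the existing NBD protocol, the command value 6 is the one proposed here, and `nbd_write_zeroes_request` is a hypothetical helper):

```c
#include <stdint.h>

#define NBD_REQUEST_MAGIC    0x25609513u
#define NBD_CMD_WRITE_ZEROES 6u   /* value proposed by this RFC */

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = v >> 24; p[1] = v >> 16; p[2] = v >> 8; p[3] = v;
}

static void put_be64(uint8_t *p, uint64_t v)
{
    put_be32(p, (uint32_t)(v >> 32));
    put_be32(p + 4, (uint32_t)v);
}

/* Fill the 28-byte oldstyle NBD request header; the caller then sends
 * these bytes on the socket.  Unlike NBD_CMD_WRITE, no data payload
 * follows the header, which is the whole point of the command. */
static void nbd_write_zeroes_request(uint8_t buf[28], uint64_t handle,
                                     uint64_t offset, uint32_t length)
{
    put_be32(buf,      NBD_REQUEST_MAGIC);
    put_be32(buf + 4,  NBD_CMD_WRITE_ZEROES);
    put_be64(buf + 8,  handle);
    put_be64(buf + 16, offset);
    put_be32(buf + 24, length);
}
```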
This lets us push holes during writes.
From my point of view, this allows the client to apply its own policy. For a QCOW2
output target the client could simply skip the block. For a RAW file it could decide
whether to use UNMAP and produce a sparse file, or to use fallocate.
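For the RAW-file case, the policy choice could be sketched as follows (an illustration, not QEMU code; `raw_write_zeroes` is a hypothetical helper, and the Linux-specific `fallocate` flags require kernel >= 3.15 for `FALLOC_FL_ZERO_RANGE`):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Server-side handling of a zero-write on a raw file: either punch a
 * hole (result stays sparse but reads as zeroes) or allocate zeroed
 * space, without any zero payload having crossed the wire. */
static int raw_write_zeroes(int fd, off_t offset, off_t len, int may_unmap)
{
    if (may_unmap) {
        /* deallocate: the range reads back as zeroes, file stays sparse */
        return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                         offset, len);
    }
    /* keep the allocation: zero the range in place */
    return fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, len);
}
```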
  Do we have the converse
operation, that is, an easy way to query if a block of data will read as
all zeroes, and therefore the client can bypass reading that portion of
the disk (in other words, an equivalent to lseek(SEEK_HOLE/SEEK_DATA))?

exactly!

static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
...
    /* query block state */
    ret = bdrv_get_block_status_above(source, NULL, sector_num,
                                      nb_sectors, &pnum, &file);
    if (ret < 0 || pnum < nb_sectors ||
            (ret & BDRV_BLOCK_DATA && !(ret & BDRV_BLOCK_ZERO))) {
        bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
                       mirror_read_complete, op);
    } else if (ret & BDRV_BLOCK_ZERO) {
        /* skip the read op if allowed */
        bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
                              s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
                              mirror_write_complete, op);
    } else {
        assert(!(ret & BDRV_BLOCK_DATA));
        bdrv_aio_discard(s->target, sector_num, op->nb_sectors,
                         mirror_write_complete, op);
    }
    return delay_ns;

Actually, earlier today I tried adding a .bdrv_co_write_zeroes
callback to NBD and it just works as expected. The problem is that the
callback can not be implemented with NBD_CMD_TRIM while still conforming
to the NBD spec. But in QEMU -> QEMU communication it just works.

http://lists.nongnu.org/archive/html/qemu-devel/2016-02/msg03810.html

Den


