From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH] nbd: fix trim/discard commands with a length bigger than NBD_MAX_BUFFER_SIZE
Date: Tue, 10 May 2016 08:01:00 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

[adding nbd-devel, qemu-block]

On 05/06/2016 02:45 AM, Quentin Casasnovas wrote:
> When running fstrim on a filesystem mounted through qemu-nbd with
> --discard=on, fstrim would fail with I/O errors:
>   $ fstrim /k/spl/ice/
>   fstrim: /k/spl/ice/: FITRIM ioctl failed: Input/output error
> and qemu-nbd was spitting these:
>   nbd.c:nbd_co_receive_request():L1232: len (94621696) is larger than max len (33554432)

> The length of the request seems huge but this is really just the filesystem
> telling the block device driver that "this length should be trimmed", and,
> unlike for a NBD_CMD_READ or NBD_CMD_WRITE, we'll not try to read/write
> that amount of data from/to the NBD socket.  It is thus safe to remove the
> length check for a NBD_CMD_TRIM.
> I've confirmed this with both the protocol documentation at:
>  https://github.com/yoe/nbd/blob/master/doc/proto.md

Hmm. The current wording of the experimental block size additions does
NOT allow the client to send a NBD_CMD_TRIM with a size larger than the
maximum NBD_CMD_WRITE:

Maybe we should revisit that in the spec, and/or advertise yet another
block size (since the maximum size for a trim and/or write_zeroes
request may indeed be different than the maximum size for a read/write).
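
To make the idea concrete, here is a minimal sketch of what separate
limits could look like on the server side. Everything in it is
hypothetical: the struct, its field names and the helper are my own
illustration, and nothing here is defined by the NBD spec or present in
qemu today.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-export limits; neither field exists in the NBD spec. */
struct sketch_limits {
    uint32_t max_data_len;  /* largest READ/WRITE; bounded by buffer size */
    uint32_t max_trim_len;  /* largest TRIM/WRITE_ZEROES; no payload sent */
};

/*
 * Pick the limit that matches the kind of request: commands that move
 * 'len' bytes over the socket use max_data_len, the others use
 * max_trim_len.
 */
static bool sketch_len_allowed(const struct sketch_limits *lim,
                               bool has_payload, uint32_t len)
{
    uint32_t max = has_payload ? lim->max_data_len : lim->max_trim_len;

    return len <= max;
}
```

With max_data_len set to 33554432 and max_trim_len set to UINT32_MAX, the
94621696-byte trim from the report above would pass while a write of the
same length would still be refused.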

But since the kernel is the one sending the large length request, and
since you are right that this is not a denial-of-service in the amount
of data being sent in a single NBD message, I definitely agree that qemu
would be wise, as a quality-of-implementation matter, to allow the larger size,
for maximum interoperability, even if it exceeds advertised limits (that
is, when no limits are advertised, we should handle everything possible
if it is not so large as to be construed a denial-of-service, and
NBD_CMD_TRIM is not large; and when limits ARE advertised, a client that
violates limits is out of spec but we can still be liberal and respond
successfully to such a client rather than having to outright reject it).
 So I think this patch is headed in the right direction.
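
As a rough illustration of that policy (not the code I expect to land in
qemu), the check can be phrased as "only bound the length of commands
that carry a payload". The SKETCH_ names below are made up for this
example; the command numbers and the 16-bit command mask are assumed to
match proto.md and qemu's current nbd.h, and the 32M cap is the one
shown in the log message above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the NBD command numbers in proto.md. */
enum {
    SKETCH_CMD_READ  = 0,
    SKETCH_CMD_WRITE = 1,
    SKETCH_CMD_DISC  = 2,
    SKETCH_CMD_FLUSH = 3,
    SKETCH_CMD_TRIM  = 4,
};

#define SKETCH_CMD_MASK_COMMAND 0x0000ffffu      /* low 16 bits: command */
#define SKETCH_MAX_BUFFER_SIZE  (32u * 1024 * 1024)  /* 33554432 bytes */

/* Only READ and WRITE transfer 'len' bytes of data over the socket. */
static bool sketch_cmd_has_payload(uint32_t type)
{
    uint32_t cmd = type & SKETCH_CMD_MASK_COMMAND;

    return cmd == SKETCH_CMD_READ || cmd == SKETCH_CMD_WRITE;
}

/* Returns 0 if the length is acceptable, -EINVAL otherwise. */
static int sketch_check_request_len(uint32_t type, uint32_t len)
{
    if (sketch_cmd_has_payload(type) && len > SKETCH_MAX_BUFFER_SIZE) {
        return -EINVAL;
    }
    return 0;
}
```

Note that this variant is slightly more liberal than the patch, which
exempts only NBD_CMD_TRIM: here any command without a payload skips the
buffer-size check.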

> and looking at the kernel side implementation of the nbd device
> (drivers/block/nbd.c) where it only sends the request header with no data
> for a NBD_CMD_TRIM.
> With this fix in, I am now able to run fstrim on my qcow2 images and keep
> them small (or at least keep their size proportional to the amount of data
> present on them).
> Signed-off-by: Quentin Casasnovas <address@hidden>
> CC: Paolo Bonzini <address@hidden>
> CC: <address@hidden>
> CC: <address@hidden>
> CC: <address@hidden>

This is NOT trivial material and should not go in through that tree.
However, I concur that it qualifies for a backport on a stable branch.

> ---
>  nbd.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> diff --git a/nbd.c b/nbd.c
> index b3d9654..e733669 100644
> --- a/nbd.c
> +++ b/nbd.c
> @@ -1209,6 +1209,11 @@ static ssize_t nbd_co_send_reply(NBDRequest *req, struct nbd_reply *reply,
>      return rc;
>  }
> +static bool nbd_should_check_request_size(const struct nbd_request *request)
> +{
> +        return (request->type & NBD_CMD_MASK_COMMAND) != NBD_CMD_TRIM;
> +}
> +
>  static ssize_t nbd_co_receive_request(NBDRequest *req, struct nbd_request *request)
>  {
>      NBDClient *client = req->client;
> @@ -1227,7 +1232,8 @@ static ssize_t nbd_co_receive_request(NBDRequest *req, struct nbd_request *reque
>          goto out;
>      }
> -    if (request->len > NBD_MAX_BUFFER_SIZE) {
> +    if (nbd_should_check_request_size(request) &&
> +        request->len > NBD_MAX_BUFFER_SIZE) {

I'd rather sort out the implications of this on the NBD protocol before
taking anything into qemu.  We've got time on our hands, so let's use it
to get this right.  (That, and I have several pending patches that
conflict with this as part of adding WRITE_ZEROES and INFO_BLOCK_SIZE
support, where it may be easier to resubmit this fix on top of my
pending patches).

>          LOG("len (%u) is larger than max len (%u)",
>              request->len, NBD_MAX_BUFFER_SIZE);
>          rc = -EINVAL;

Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
