[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [Nbd] [PATCH] nbd: fix trim/discard commands with a len

From: Quentin Casasnovas
Subject: Re: [Qemu-devel] [Nbd] [PATCH] nbd: fix trim/discard commands with a length bigger than NBD_MAX_BUFFER_SIZE
Date: Tue, 10 May 2016 17:45:45 +0200
User-agent: Mutt/1.5.24 (2015-08-30)

On Tue, May 10, 2016 at 04:38:29PM +0100, Alex Bligh wrote:
> Eric,
> On 10 May 2016, at 16:29, Eric Blake <address@hidden> wrote:
> >>> Maybe we should revisit that in the spec, and/or advertise yet another
> >>> block size (since the maximum size for a trim and/or write_zeroes
> >>> request may indeed be different than the maximum size for a read/write).
> >> 
> >> I think it's up to the server to either handle large requests, or
> >> for the client to break these up.
> > 
> > But the question at hand here is whether we should permit servers to
> > advertise multiple maximum block sizes (one for read/write, another one
> > for trim/write_zero, or even two [at least qemu tracks a separate
> > maximum trim vs. write_zero sizing in its generic block layer]), or
> > merely stick with the current wording that requires clients that honor
> > maximum block size to obey the same maximum for ALL commands, regardless
> > of amount of data sent over the wire.
> In my view, we should not change this. Block sizes maxima are not there
> to support DoS prevention (that's a separate phrase). They are there
> to impose maximum block sizes. Adding a different maximum block size
> for different commands is way too overengineered. There are after
> all reasons (especially without structured replies) why you'd want
> different maximum block sizes for writes and reads. If clients support
> block sizes, they will necessarily have to have the infrastructure
> to break requests up.
> IE maximum block size should continue to mean maximum block size.
> >> 
> >> The core problem here is that the kernel (and, ahem, most servers) are
> >> ignorant of the block size extension, and need to guess how to break
> >> things up. In my view the client (kernel in this case) should
> >> be breaking the trim requests up into whatever size it uses as the
> >> maximum size write requests. But then it would have to know about block
> >> sizes which are in (another) experimental extension.
> > 
> > Correct - no one has yet patched the kernel to honor block sizes
> > advertised through what is currently an experimental extension.
> Unsurprising at it's still experimental, and only settled down a couple
> of weeks ago :-)
> >  (We
> > have ioctl(NBD_SET_BLKSIZE) which can be argued to set the kernel's
> > minimum block size,
> Technically that is 'out of band transmission of block size
> constraints' :-)
> > but I haven't audited whether the kernel actually
> > guarantees that all client requests are sent aligned to the value passed
> > that way - but we have nothing to set the maximum size,
> indeed
> > and are at the
> > mercy of however the kernel currently decides to split large requests).
> I am surprised TRIM doesn't get broken up the same way READ and WRITE
> do.

I'm by no mean an expert in this, but why would the kernel break up those
TRIM commands?  After all, breaking things up makes sense when the length
of the request is big, not that much when it only contains the request
header, which is the case for TRIM commands.

What am I missing?


reply via email to

[Prev in Thread] Current Thread [Next in Thread]