[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [Nbd] [PATCH v2 3/3] doc: Propose Structured Read exten

From: Wouter Verhelst
Subject: Re: [Qemu-devel] [Nbd] [PATCH v2 3/3] doc: Propose Structured Read extension
Date: Wed, 30 Mar 2016 22:44:01 +0200
User-agent: Mutt/1.5.24 (2015-08-30)


On Tue, Mar 29, 2016 at 05:01:00PM -0600, Eric Blake wrote:
> +
> +    Defined by the experimental `Structured Read` extension; see below.

(detail: that makes the "structured read" extension be typographically
different from everything else. Either make it all caps, or not

>      A write request. Length and offset define the location and amount of
> @@ -536,13 +556,16 @@ The following error values are defined:
>  * `ENOMEM` (12), Cannot allocate memory.
>  * `EINVAL` (22), Invalid argument.
>  * `ENOSPC` (28), No space left on device.
> +* `EOVERFLOW` (75), Value too large.
>  The server SHOULD return `ENOSPC` if it receives a write request
>  including one or more sectors beyond the size of the device.  It SHOULD
>  return `EINVAL` if it receives a read or trim request including one or
>  more sectors beyond the size of the device.  It also SHOULD map the
> -`EDQUOT` and `EFBIG` errors to `ENOSPC`.  Finally, it SHOULD return
> -`EPERM` if it receives a write or trim request on a read-only export.
> +`EDQUOT` and `EFBIG` errors to `ENOSPC`.  It SHOULD return `EOVERFLOW`
> +on a request to send structured read data without fragmentation but
> +where the length is too large.  Finally, it SHOULD return `EPERM` if
> +it receives a write or trim request on a read-only export.

I'd like some more explicit language here that makes it clear EOVERFLOW
can *only* be used on structured replies. We reduced the set of valid
error numbers a while back for good reason; it would be bad if we
accidentally increase it for existing replies now.

>  The server SHOULD return `EINVAL` if it receives an unknown command.
> @@ -579,7 +602,7 @@ To remedy this, a `SELECT` extension is envisioned. This 
> extension adds
>  two option requests and one error reply type, and extends one existing
>  option reply type.
> +* `NBD_OPT_SELECT` (6)

NAK. The spec is currently consistent in associating numbers to
constants in only *one* place. This is no accident, and I'd like to keep
it that way.

(at least I think it is; if not, that's a bug)

>      The client wishes to select the export with the given name for use
>      in the transmission phase, but does not yet want to move to the
> @@ -613,7 +636,7 @@ option reply type.
>        handle `NBD_REP_ERR_UNSUP`. In this case, they should fall back to
>        using `NBD_OPT_EXPORT_NAME`.
> -* `NBD_OPT_GO`
> +* `NBD_OPT_GO` (7)


>      The client wishes to terminate the negotiation phase and progress to
>      the transmission phase. Possible replies from the server include:
> @@ -635,6 +658,235 @@ option reply type.
>        message if they do not also send it as a reply to the
>        `NBD_OPT_SELECT` message.
> +### `Structured Read` extension
> +
> +Some of the major downsides of the default reply to `NBD_CMD_READ`
> +(without structured replies) are as follows.  First, it is not
> +possible to support partial reads (the command must succeed or fail as
> +a whole, either len bytes of data must be sent or the connection must
> +be closed).  There is no way to efficiently skip over portions of a
> +sparse file that are known to contain all zeroes.  Finally, it is not
> +possible to reliably decode the server traffic without also having
> +context of what pending read requests were sent by the client.
> +
> +To remedy this, a `Structured Read` extension is envisioned. This
> +extension adds a new option request, a new reply type during the
> +transmission phase, and a new command flag, and alters the set of
> +valid replies to an existing command.
> +


> +    The client wishes to use structured reads during the transmission
> +    phase.  The option request has no additional data.
> +
> +    The server replies with one of the following:
> +
> +    - `NBD_REP_ACK`: Structured reads have been negotiated; the server
> +      MUST use structured replies to `NBD_CMD_READ`
> +    - `NBD_REP_UNSUP`: Structured reads are not available; the transmission
                  ^ ERR_

(however, see below ;-)

> +      phase MUST remain the same as if the client had not attempted

This makes it seem as though NBD_REP_ERR_UNSUP has a different meaning
here than it usually does. I think the spec should just say that the
server should reply with NBD_REP_ACK, and then mention that for
backwards compatibility the client should be prepared to handle
NBD_REP_UNSUP too (as is done elsewhere).

That is, if the server implements structured replies, it should allow it
(it makes no sense for the server to disallow structured reads if it
knows about them)

> +    A server MUST NOT send a data payload in a normal reply if
> +    Structured Reads are negotiated.  It is envisioned that all future
> +    extension commands that require a data payload in the response
> +    will require independent option negotiation, and therefore, the
> +    `NBD_CMD_READ` command is the only command that is allowed to use
> +    the data payload of a normal reply, and only when Structured Reads
> +    were not negotiated.  However, for ease of implementation, a
> +    server MAY close the connection rather than entering transmission
> +    phase if, at the end of option haggling, the client has negotiated
> +    another command that requires a structured reply but did not also
> +    negotiate Structured Reads.

(see my comments on this downthread)

> +* `NBD_CMD_FLAG_DF` (bit 1)
> +
> +    Valid during `NBD_CMD_READ`.  SHOULD be set to 1 if the client
> +    requires the server to send at most one data chunk in reply.  MUST
> +    NOT be set unless the client negotiated Structured Reads with the
> +    server.

I realize I'm flip-flopping on whether or not to use a flag bit for
this, but bear with me on this one for a moment.

There is an ioctl NBD_SET_FLAGS which just expects the per-export flags
to be passed to the kernel. By reusing that, there is no need for an
extra kernel call to specify the options the kernel-side client can use.
That still leaves the ability for userspace to detect whether the client
can interpret structured replies, but this could easily be signalled
using a sysfs flag (or similar).

If the client can do something along the lines of:

check sysfs (or whatever)
get flags from server
call NBD_SET_FLAGS ioctl

and that signals *everything* to both the kernel and the server, then we
don't need any extra uapi calls to be defined. This is probably a good

However, in order for that to be possible, the per-export flags field
needs to have a bit set to signal the server's understanding of, and
client's permission to use, NBD_CMD_FLAG_DF. As such, I think we need an
extra flag in the per-export flags field of the protocol, even though
we've already negotiated structured reads and I expressed the preference
that this shouldn't be decoupled.

Yes, that's slightly ugly.

Thinking about this, I suppose it makes sense to rename the "global"
flags field as the "negotiation flags field" which signals incompatible
changes in the negotiation phase, and the "per-export" flags field as
the "transmission flags field" which signals features the client can use
during transmission, or something along those lines. Thoughts?


no further comments (other than what's already been said)

< ron> I mean, the main *practical* problem with C++, is there's like a dozen
       people in the world who think they really understand all of its rules,
       and pretty much all of them are just lying to themselves too.
 -- #debian-devel, OFTC, 2016-02-12

reply via email to

[Prev in Thread] Current Thread [Next in Thread]