Re: [Qemu-devel] [PATCH v2 3/3] doc: Propose Structured Read extension

From: Alex Bligh
Subject: Re: [Qemu-devel] [PATCH v2 3/3] doc: Propose Structured Read extension
Date: Wed, 30 Mar 2016 07:50:06 +0100


There's a lot in common between our two proposals now (unsurprisingly).
You've highlighted the differences in the other mail and I'll
comment on them there. You may want to steal some of my wording as
I think there are bits I've got that you haven't (as well as vice versa).
But I'm inclined to use yours as a base unless you particularly
like mine.

Comments inline below.


On 30 Mar 2016, at 00:01, Eric Blake <address@hidden> wrote:

> +    While the server is permitted to send at most one normal reply (or
> +    else close the connection), a command that uses structured replies
> +    may document that the server is permitted to send multiple replies,
> +    all sharing the same handle,

The thought is fine, but the language is confusing. I think this is
a single reply, made up of multiple parts (I called them chunks). You've
called them multiple replies, which I think makes things less clear.
Also below you've started using my 'chunk' language anyway!

> by using the `NBD_REPLY_FLAG_DONE`
> +    (bit 0) to delineate the final reply. The server MAY interleave
> +    intermediate replies to one structured command with replies
> +    relating to a different handle.


The argument against this route is that now there are essentially
two ways to end a chain of chunks (with and without a NONE chunk)
which is necessary for the reasons you set out. On balance I like it though.
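
As a rough illustration of those two endings (not taken from either proposal), here is a minimal Python sketch of how a client might find the end of one reply. Only bit 0 for `NBD_REPLY_FLAG_DONE` and type 0 for `NBD_REPLY_TYPE_NONE` come from the quoted text; the `(flags, type)` tuple layout, the `collect_reply` name and the value used for `NBD_REPLY_TYPE_OFFSET_DATA` are assumptions made for the example.

```python
NBD_REPLY_FLAG_DONE = 1 << 0     # bit 0, as in the quoted patch
NBD_REPLY_TYPE_NONE = 0          # value given in the quoted patch
NBD_REPLY_TYPE_OFFSET_DATA = 1   # placeholder value, assumed for this sketch


def collect_reply(chunks):
    """Gather the chunks of one reply, up to and including the DONE chunk.

    `chunks` is an iterable of (flags, type) pairs that all share one handle.
    """
    reply = []
    for flags, ctype in chunks:
        reply.append((flags, ctype))
        # The DONE flag ends the reply whatever the chunk type is.
        if flags & NBD_REPLY_FLAG_DONE:
            return reply
    raise ValueError("stream ended before a chunk with NBD_REPLY_FLAG_DONE")


# Ending 1: the data is followed by a separate NONE chunk carrying DONE.
with_none = [(0, NBD_REPLY_TYPE_OFFSET_DATA),
             (NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_NONE)]
# Ending 2: the last data chunk itself carries DONE.
without_none = [(0, NBD_REPLY_TYPE_OFFSET_DATA),
                (NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_OFFSET_DATA)]
```

Either sequence is consumed by the same loop, which is why the extra ending costs the client very little.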

> +
> +    A server MUST NOT send a data payload in a normal reply if
> +    Structured Reads are negotiated.  It is envisioned that all future
> +    extension commands that require a data payload in the response
> +    will require independent option negotiation, and therefore, the
> +    `NBD_CMD_READ` command is the only command that is allowed to use
> +    the data payload of a normal reply, and only when Structured Reads
> +    were not negotiated.

See other email.

>  However, for ease of implementation, a
> +    server MAY close the connection rather than entering transmission
> +    phase if, at the end of option haggling, the client has negotiated
> +    another command that requires a structured reply but did not also
> +    negotiate Structured Reads.

That's pretty yucky given a reconnect will achieve the same result
and you'll end up in an infinite retry loop.

Wouldn't a better route be simply to say that implementing certain
commands (server or client sides) requires support of structured

> +    - `NBD_REPLY_TYPE_NONE` (0)
> +
> +      *length* MUST be 0 (and the payload field omitted).  This type
> +      SHOULD be used only as the final reply (that is, when
> +      `NBD_REPLY_FLAG_DONE` is set), and implies that the overall
> +      client request was successfully completed.

I think this would be clearer as 'SHOULD NOT be used other than as the
final reply'. Because you are also saying (I think) that you need not
have it as the final reply - it's just as good in a non-errored
reply to have NBD_REPLY_FLAG_DONE set on the last data packet (provided
you know it's not going to error before starting to send it).


> +    The server MAY split the reply into any number of data chunks,
> +    using reply types of `NBD_REPLY_TYPE_OFFSET_DATA` or
> +    `NBD_REPLY_TYPE_OFFSET_HOLE`; each chunk MUST describe at least
> +    one byte, although to minimize overhead, the server SHOULD use
> +    chunks no smaller than 512 bytes where possible (the first and

This is a good idea, but rather than 'no smaller than 512 bytes', as
it's a 'SHOULD', could we have 'the server SHOULD use chunks each
an integer multiple of 512 bytes where possible' (you already have
a carve-out for the first and last)?
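
A minimal Python sketch of one way a server could meet that wording, assuming a hypothetical `split_read` helper and an arbitrary `max_chunk` limit (neither is part of the proposal): every chunk boundary inside the request falls on a 512-byte boundary, so only the first and last chunks can have a length that is not a multiple of 512.

```python
def split_read(offset, length, max_chunk=4096, align=512):
    """Split a read into (offset, length) chunks cut at `align` boundaries.

    `max_chunk` is an assumed server-side limit, not something the
    proposal defines; it must be a positive multiple of `align`.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if max_chunk < align or max_chunk % align:
        raise ValueError("max_chunk must be a positive multiple of align")
    chunks = []
    end = offset + length
    pos = offset
    while pos < end:
        # Furthest aligned boundary reachable without exceeding max_chunk.
        boundary = (pos + max_chunk) // align * align
        stop = min(end, boundary)
        chunks.append((pos, stop - pos))
        pos = stop
    return chunks
```

For example, `split_read(100, 10000)` gives `[(100, 3996), (4096, 4096), (8192, 1908)]`: the middle chunk is a multiple of 512 and the unaligned first and last chunks are the carve-out.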


> +    If no error is detected, then the server MUST send enough chunks
> +    to cover the bytes requested.  The server MAY set the
> +    `NBD_REPLY_FLAG_DONE` on the final data chunk,

In which case it MUST NOT send any further non-data chunks
(e.g. an error chunk or a NONE chunk)

> to minimize
> +    traffic, but MUST NOT do so if it would still be possible to
> +    detect an error while transmitting the chunk.  If the last data
> +    chunk is not the final reply, the server MUST use
> +    `NBD_REPLY_TYPE_NONE` as the final reply to indicate success.

or an error chunk to indicate an error, and these final chunk MUST have

> +    If an error is detected, the server MUST send padding bytes to
> +    complete the current chunk (if any), MUST report the error with a
> +    reply type of either `NBD_REPLY_TYPE_ERROR` or
> +    `NBD_REPLY_TYPE_ERROR_OFFSET`, and MAY end the sequence of replies
> +    without sending the total number of bytes requested.  If one or
> +    more offset errors are reported, the client MAY assume that all
> +    data in chunks not including the offset,

"the offset(s)"

> and all data within the
> +    affected chunk

"within each affected chunk"

> but prior to the offset,

"prior to the relevant offset"

> is valid; the client MAY
> +    NOT assume anything about data validity if no offset is provided.

These multiple error chunks are neat. However, I suspect lazy implementors
may just send an error without an offset.
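
To make the validity rule in the quoted text concrete, here is a small Python sketch, under stated assumptions, of what a client could treat as valid after receiving error chunks. The `valid_ranges` name, the `(offset, length)` representation and the use of `None` for an error without an offset are all invented for the example.

```python
def valid_ranges(data_chunks, error_offsets):
    """Return the (offset, length) ranges a client may treat as valid.

    data_chunks   -- list of (offset, length) for the data chunks received
    error_offsets -- list of reported error offsets; None stands for an
                     error chunk that carried no offset
    """
    # An error without an offset says nothing about which data is good.
    if any(off is None for off in error_offsets):
        return []
    valid = []
    for start, length in data_chunks:
        end = start + length
        hits = [off for off in error_offsets if start <= off < end]
        if hits:
            # Within an affected chunk, only data before the first
            # reported offset is trusted.
            end = min(hits)
        if end > start:
            valid.append((start, end - start))
    return valid
```

With chunks at 0, 512 and 1024 (512 bytes each) and a single error at offset 700, this keeps the first and third chunks whole and the first 188 bytes of the second. A server that only ever sends offset-less errors leaves the client with nothing it can keep.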

> +    The server MAY send additional chunks or offset error replies, if
> +    `NBD_REPLY_FLAG_DONE` was not set, but MUST ensure the final reply
> +    also reports an error (that is, the final reply MUST NOT use
> +    `NBD_REPLY_TYPE_NONE`), and MAY reuse an offset reported earlier
> +    in constructing the final reply.

I'm not sure I get that bit. Why don't you make an errored reply simply
one that contains one or more error chunks. An errored reply need not contain
all the data requested (though each chunk must be complete). A reply that
isn't errored needs to contain all the data requested. Why do you
need anything stronger than that? So if you have a parallelised server which
is simply sending several reads in parallel (think Ceph) it sends the
result from each thread, possibly followed by an error packet, and some
other thread notices when all of these have completed and sends a
NBD_REPLY_TYPE_NONE packet (always, error or not) to close off use of the
handle. This seems perfectly natural and no harder for the client to deal
with, but you are prohibiting it.
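
A minimal Python sketch of the client side of that simpler rule: a reply is errored if it contains any error chunk, otherwise the data chunks must cover the whole request, and chunks may arrive in any order. Only the value 0 for `NBD_REPLY_TYPE_NONE` comes from the quoted patch; the other type values, the `(type, offset, length)` tuples and the `classify_reply` name are placeholders assumed for illustration.

```python
NBD_REPLY_TYPE_NONE = 0          # value given in the quoted patch
NBD_REPLY_TYPE_OFFSET_DATA = 1   # placeholder values below, assumed
NBD_REPLY_TYPE_OFFSET_HOLE = 2
NBD_REPLY_TYPE_ERROR = 3
NBD_REPLY_TYPE_ERROR_OFFSET = 4

DATA_TYPES = {NBD_REPLY_TYPE_OFFSET_DATA, NBD_REPLY_TYPE_OFFSET_HOLE}
ERROR_TYPES = {NBD_REPLY_TYPE_ERROR, NBD_REPLY_TYPE_ERROR_OFFSET}


def classify_reply(chunks, req_offset, req_length):
    """Classify one complete reply as "errored", "ok" or "invalid".

    `chunks` is a list of (type, offset, length) tuples for one handle,
    in arrival order; offset is None where the chunk carries none.
    """
    # Any error chunk makes the whole reply errored, wherever it appears.
    if any(ctype in ERROR_TYPES for ctype, _, _ in chunks):
        return "errored"
    # Otherwise the data and hole chunks must tile the request exactly.
    pos = req_offset
    spans = sorted((off, length) for ctype, off, length in chunks
                   if ctype in DATA_TYPES)
    for off, length in spans:
        if off != pos:
            return "invalid"   # gap or overlap
        pos += length
    return "ok" if pos == req_offset + req_length else "invalid"
```

The trailing `NBD_REPLY_TYPE_NONE` chunk plays no part in the classification here; it only tells the client that the handle is finished.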

>  A server SHOULD NOT mix
> +    to the same request.
> +
> +    A client MAY close the connection if it detects that the server
> +    has sent invalid chunks (such as overlapping data, or not enough
> +    data before claiming success).
> +
> +    In order to avoid the burden of reassembly, the client MAY set the
> +    `NBD_CMD_FLAG_DF` flag (bit 1), which instructs the server to not
> +    fragment the reply.  If this flag is set, the server MUST send at
> +    most one `NBD_REPLY_TYPE_OFFSET_DATA` or
> +    `NBD_REPLY_TYPE_OFFSET_HOLE`, although it MAY still send more than
> +    one reply (for error reporting, or a final `NBD_REPLY_TYPE_NONE`).  If

"the flag is set and"

> +    the client's length request is larger than 65,536 bytes (or if a
> +    later extension adds a way to negotiate a larger maximum fragment
> +    size), the server MAY reject the command with `EOVERFLOW`.  The
> +    `EOVERFLOW` error MUST NOT be used if the `NBD_CMD_FLAG_DF` flag
> +    was not set, or if the requested length is no larger than 65,536.
> +
> ## About this file
> This file tries to document the NBD protocol as it is currently
> -- 
> 2.5.5
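
For reference, the `EOVERFLOW` condition in the last quoted paragraph (with the "the flag is set and" qualifier applied) can be sketched in Python as below. Bit 1 for `NBD_CMD_FLAG_DF` and the 65,536-byte limit come from the quoted text; the helper name and the constant name `MAX_UNFRAGMENTED` are made up for the example.

```python
NBD_CMD_FLAG_DF = 1 << 1     # bit 1, as in the quoted patch
MAX_UNFRAGMENTED = 65536     # limit stated in the quoted patch


def may_reject_with_eoverflow(cmd_flags, length):
    """True only when a server is allowed to answer with EOVERFLOW.

    Both conditions must hold: the client set NBD_CMD_FLAG_DF and asked
    for more than MAX_UNFRAGMENTED bytes.
    """
    return bool(cmd_flags & NBD_CMD_FLAG_DF) and length > MAX_UNFRAGMENTED
```

Note that the result is a permission rather than a requirement: the quoted text says the server MAY reject such a request, so a server able to return the data in one chunk can still do so.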

Alex Bligh
