[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Libguestfs] [libnbd PATCH v3 22/22] api: Add nbd_[aio_]block_status
From: |
Richard W.M. Jones |
Subject: |
Re: [Libguestfs] [libnbd PATCH v3 22/22] api: Add nbd_[aio_]block_status_filter() |
Date: |
Thu, 8 Jun 2023 10:55:14 +0100 |
User-agent: |
Mutt/1.5.21 (2010-09-15) |
On Thu, May 25, 2023 at 08:01:08AM -0500, Eric Blake wrote:
> As part of extending NBD to support 64-bit lengths, the protocol also
> added an option for servers to allow clients to request filtered
> responses to NBD_CMD_BLOCK_STATUS when more than one meta-context is
> negotiated (see NBD commit e6f3b94a). At the same time as this patch,
> qemu-nbd was taught to support and advertise this feature as a server,
> but does not utilize it as a client (qemu doesn't yet need to connect
> to multiple contexts at once). Thus, addding generic client support
> and enhancing the interop/ test in libnbd is needed to prove that the
> feature is viable and worth standardizing.
> ---
> lib/internal.h | 5 +-
> generator/API.ml | 71 +++++++++++++++--
> generator/states-issue-command.c | 4 +-
> lib/aio.c | 7 +-
> lib/rw.c | 127 ++++++++++++++++++++++++++++++-
> interop/block-status-payload.c | 117 +++++++++++++++++++++++++++-
> interop/block-status-payload.sh | 14 +++-
> info/info-can.sh | 3 +
> 8 files changed, 336 insertions(+), 12 deletions(-)
>
> diff --git a/lib/internal.h b/lib/internal.h
> index 2948b77b..64921de9 100644
> --- a/lib/internal.h
> +++ b/lib/internal.h
> @@ -73,6 +73,8 @@ struct meta_context {
> };
> DEFINE_VECTOR_TYPE (meta_vector, struct meta_context);
>
> +DEFINE_VECTOR_TYPE(uint32_vector, uint32_t);
> +
> struct export {
> char *name;
> char *description;
> @@ -380,7 +382,8 @@ struct command {
> uint64_t cookie;
> uint64_t offset;
> uint64_t count;
> - void *data; /* Buffer for read/write */
> + void *data; /* Buffer for read/write, uint32_vector* for status payload */
> + uint32_vector *ids; /* For block status with payload */
> struct command_cb cb;
> bool initialized; /* For read, true if getting a hole may skip memset */
> uint32_t data_seen; /* For read, cumulative size of data chunks seen */
> diff --git a/generator/API.ml b/generator/API.ml
> index 5a31ce3b..a26ed1da 100644
> --- a/generator/API.ml
> +++ b/generator/API.ml
> @@ -2335,12 +2335,13 @@ "can_block_status_payload", {
> longdesc = "\
> Returns true if the server supports the use of the
> C<LIBNBD_CMD_FLAG_PAYLOAD_LEN> flag to allow filtering of the
> -block status command. Returns
> +block status command (see L<nbd_block_status_filter(3)>). Returns
> false if the server does not. Note that this will never return
> true if L<nbd_get_extended_headers_negotiated(3)> is false."
> ^ non_blocking_test_call_description;
> see_also = [SectionLink "Flag calls"; Link "opt_info";
> - Link "get_extended_headers_negotiated"];
> + Link "get_extended_headers_negotiated";
> + Link "block_status_filter"];
> example = Some "examples/server-flags.c";
> };
>
> @@ -2409,6 +2410,10 @@ "can_meta_context", {
> meta contexts were requested but there is a missing or failed
> attempt at NBD_OPT_SET_META_CONTEXT during option negotiation.
>
> +If the server supports block status filtering (see
> +L<nbd_can_block_status_payload(3)>, this function must return
> +true for any filter name passed to L<nbd_block_status_filter(3)>.
> +
> The single parameter is the name of the metadata context,
> for example C<LIBNBD_CONTEXT_BASE_ALLOCATION>.
> B<E<lt>libnbd.hE<gt>> includes defined constants for well-known
> @@ -2941,9 +2946,12 @@ "block_status_64", {
> information about blocks beginning from the specified
> offset to be returned. The C<count> parameter is a hint: the
> server may choose to return less status, or the final block
> -may extend beyond the requested range. If multiple contexts
> +may extend beyond the requested range. When multiple contexts
> are supported, the number of blocks and cumulative length
> -of those blocks need not be identical between contexts.
> +of those blocks need not be identical between contexts; this
> +command generally returns the status of all negotiated contexts,
> +while some servers also support a filtered request (see
> +L<nbd_can_block_status_payload(3)>, L<nbd_block_status_filter(3)>).
>
> Note that not all servers can support a C<count> of 4GiB or larger;
> L<nbd_get_extended_headers_negotiated(3)> indicates which servers
> @@ -2993,11 +3001,38 @@ "block_status_64", {
> does not exceed C<count> bytes; however, libnbd does not
> validate that the server obeyed the flag."
> ^ strict_call_description;
> - see_also = [Link "block_status";
> + see_also = [Link "block_status"; Link "block_status_filter";
> Link "add_meta_context"; Link "can_meta_context";
> Link "aio_block_status_64"; Link "set_strict_mode"];
> };
>
> + "block_status_filter", {
> + default_call with
> + args = [ UInt64 "count"; UInt64 "offset"; StringList "contexts";
> + Closure extent64_closure ];
> + optargs = [ OFlags ("flags", cmd_flags, Some ["REQ_ONE"; "PAYLOAD_LEN"])
> ];
> + ret = RErr;
> + permitted_states = [ Connected ];
> + shortdesc = "send filtered block status command, with 64-bit callback";
> + longdesc = "\
> +Issue a filtered block status command to the NBD server. If
> +supported by the server (see L<nbd_can_block_status_payload(3)>),
> +this causes metadata context information about blocks beginning
> +from the specified offset to be returned, and with the result
> +limited to just the contexts specified in C<filter>. Note that
> +all strings in C<filter> must be supported by
> +L<nbd_can_meta_context(3)>.
> +
> +All other parameters to this function have the same semantics
> +as in L<nbd_block_status_64(3)>; except that for convenience,
> +the C<flags> parameter may additionally contain or omit
> +C<LIBNBD_CMD_FLAG_PAYLOAD_LEN>."
> +^ strict_call_description;
> + see_also = [Link "block_status_64";
> + Link "can_block_status_payload"; Link "can_meta_context";
> + Link "aio_block_status_filter"; Link "set_strict_mode"];
> + };
> +
> "poll", {
> default_call with
> args = [ Int "timeout" ]; ret = RInt;
> @@ -3667,6 +3702,30 @@ "aio_block_status_64", {
> Link "set_strict_mode"];
> };
>
> + "aio_block_status_filter", {
> + default_call with
> + args = [ UInt64 "count"; UInt64 "offset"; StringList "contexts";
> + Closure extent64_closure ];
> + optargs = [ OClosure completion_closure;
> + OFlags ("flags", cmd_flags, Some ["REQ_ONE"; "PAYLOAD_LEN"])
> ];
> + ret = RCookie;
> + permitted_states = [ Connected ];
> + shortdesc = "send filtered block status command to the NBD server";
> + longdesc = "\
> +Send a filtered block status command to the NBD server.
> +
> +To check if the command completed, call L<nbd_aio_command_completed(3)>.
> +Or supply the optional C<completion_callback> which will be invoked
> +as described in L<libnbd(3)/Completion callbacks>.
> +
> +Other parameters behave as documented in L<nbd_block_status_filter(3)>."
> +^ strict_call_description;
> + see_also = [SectionLink "Issuing asynchronous commands";
> + Link "aio_block_status_64"; Link "block_status_filter";
> + Link "can_meta_context"; Link "can_block_status_payload";
> + Link "set_strict_mode"];
> + };
> +
> "aio_get_fd", {
> default_call with
> args = []; ret = RFd;
> @@ -4201,6 +4260,8 @@ let first_version =
> "opt_extended_headers", (1, 18);
> "aio_opt_extended_headers", (1, 18);
> "can_block_status_payload", (1, 18);
> + "block_status_filter", (1, 18);
> + "aio_block_status_filter", (1, 18);
>
> (* These calls are proposed for a future version of libnbd, but
> * have not been added to any released version so far.
> diff --git a/generator/states-issue-command.c
> b/generator/states-issue-command.c
> index 79136b61..5307731d 100644
> --- a/generator/states-issue-command.c
> +++ b/generator/states-issue-command.c
> @@ -84,7 +84,9 @@ ISSUE_COMMAND.PREPARE_WRITE_PAYLOAD:
> assert (h->cmds_to_issue != NULL);
> cmd = h->cmds_to_issue;
> assert (cmd->cookie == be64toh (h->req.compact.handle));
> - if (cmd->type == NBD_CMD_WRITE) {
> + if (cmd->type == NBD_CMD_WRITE ||
> + (h->extended_headers && cmd->type == NBD_CMD_BLOCK_STATUS &&
> + cmd->flags & NBD_CMD_FLAG_PAYLOAD_LEN)) {
> h->wbuf = cmd->data;
> h->wlen = cmd->count;
> if (cmd->next && cmd->count < 64 * 1024)
> diff --git a/lib/aio.c b/lib/aio.c
> index a419ac32..77b20c32 100644
> --- a/lib/aio.c
> +++ b/lib/aio.c
> @@ -32,8 +32,13 @@ void
> nbd_internal_retire_and_free_command (struct command *cmd)
> {
> /* Free the callbacks. */
> - if (cmd->type == NBD_CMD_BLOCK_STATUS)
> + if (cmd->type == NBD_CMD_BLOCK_STATUS) {
> + if (cmd->ids) {
> + uint32_vector_reset (cmd->ids);
> + free (cmd->ids);
> + }
> FREE_CALLBACK (cmd->cb.fn.extent);
> + }
> if (cmd->type == NBD_CMD_READ)
> FREE_CALLBACK (cmd->cb.fn.chunk);
> FREE_CALLBACK (cmd->cb.completion);
> diff --git a/lib/rw.c b/lib/rw.c
> index bea55fa1..db6bc0bc 100644
> --- a/lib/rw.c
> +++ b/lib/rw.c
> @@ -242,6 +242,26 @@ nbd_unlocked_block_status_64 (struct nbd_handle *h,
> return wait_for_command (h, cookie);
> }
>
> +/* Issue a filtered block status command and wait for the reply. */
> +int
> +nbd_unlocked_block_status_filter (struct nbd_handle *h,
> + uint64_t count, uint64_t offset,
> + char **filter,
> + nbd_extent64_callback *extent64,
> + uint32_t flags)
> +{
> + int64_t cookie;
> + nbd_completion_callback c = NBD_NULL_COMPLETION;
> +
> + cookie = nbd_unlocked_aio_block_status_filter (h, count, offset, filter,
> + extent64, &c, flags);
> + if (cookie == -1)
> + return -1;
> +
> + assert (CALLBACK_IS_NULL (*extent64));
> + return wait_for_command (h, cookie);
> +}
> +
> /* count_err represents the errno to return if bounds check fail */
> int64_t
> nbd_internal_command_common (struct nbd_handle *h,
> @@ -250,6 +270,7 @@ nbd_internal_command_common (struct nbd_handle *h,
> void *data, struct command_cb *cb)
> {
> struct command *cmd;
> + uint32_vector *ids = NULL;
>
> if (h->disconnect_request) {
> set_error (EINVAL, "cannot request more commands after NBD_CMD_DISC");
> @@ -297,10 +318,23 @@ nbd_internal_command_common (struct nbd_handle *h,
> }
> break;
>
> + case NBD_CMD_BLOCK_STATUS:
> + if (data) {
> + ids = data;
> + count = ids->len * sizeof (uint32_t);
> + data = ids->ptr;
> + if (count > MAX_REQUEST_SIZE ||
> + (h->strict & LIBNBD_STRICT_PAYLOAD && count > h->payload_maximum))
> {
> + set_error (ERANGE, "filter set too large");
> + goto err;
> + }
> + break;
> + }
> + /* fallthrough */
> + default:
> /* Other commands are limited by the 32 bit field in the command
> * structure on the wire, unless extended headers were negotiated.
> */
> - default:
> if (!h->extended_headers && count > UINT32_MAX) {
> set_error (ERANGE, "request too large: maximum request size is %"
> PRIu32,
> UINT32_MAX);
> @@ -320,6 +354,7 @@ nbd_internal_command_common (struct nbd_handle *h,
> cmd->offset = offset;
> cmd->count = count;
> cmd->data = data;
> + cmd->ids = ids;
> if (cb)
> cmd->cb = *cb;
>
> @@ -364,8 +399,13 @@ nbd_internal_command_common (struct nbd_handle *h,
> err:
> /* Since we did not queue the command, we must free the callbacks. */
> if (cb) {
> - if (type == NBD_CMD_BLOCK_STATUS)
> + if (type == NBD_CMD_BLOCK_STATUS) {
> + if (ids) {
> + uint32_vector_reset (ids);
> + free (ids);
> + }
> FREE_CALLBACK (cb->fn.extent);
> + }
> if (type == NBD_CMD_READ)
> FREE_CALLBACK (cb->fn.chunk);
> FREE_CALLBACK (cb->completion);
> @@ -609,3 +649,86 @@ nbd_unlocked_aio_block_status_64 (struct nbd_handle *h,
> return nbd_internal_command_common (h, flags, NBD_CMD_BLOCK_STATUS, offset,
> count, EINVAL, NULL, &cb);
> }
> +
> +int64_t
> +nbd_unlocked_aio_block_status_filter (struct nbd_handle *h,
> + uint64_t count, uint64_t offset,
> + char **filter,
> + nbd_extent64_callback *extent64,
> + nbd_completion_callback *completion,
> + uint32_t flags)
> +{
> + struct command_cb cb = { .fn.extent = *extent64,
> + .completion = *completion };
> + uint32_vector *ids;
> + char *name;
> + size_t i;
> +
> + /* Because this affects wire format, it is more convenient to manage
> + * PAYLOAD_LEN by what was negotiated than to require the user to
> + * have to set it correctly.
> + */
> + if (!h->extended_headers) {
> + set_error (ENOTSUP, "server does not support extended headers");
> + return -1;
> + }
> + flags |= LIBNBD_CMD_FLAG_PAYLOAD_LEN;
> +
> + if (h->strict & LIBNBD_STRICT_COMMANDS) {
> + if (nbd_unlocked_can_block_status_payload (h) != 1) {
> + set_error (EINVAL,
> + "server does not support the block status payload flag");
> + return -1;
> + }
> +
> + if (!h->meta_valid || h->meta_contexts.len == 0) {
> + set_error (ENOTSUP, "did not negotiate any metadata contexts, "
> + "either you did not call nbd_add_meta_context before "
> + "connecting or the server does not support it");
> + return -1;
> + }
> + }
> +
> + ids = calloc (1, sizeof *ids);
> + if (ids == NULL) {
> + set_error (errno, "calloc");
> + return -1;
> + }
> + if (uint32_vector_append (ids, htobe32 (count >> 32)) == -1 ||
> + uint32_vector_append (ids, htobe32 (count)) == -1) {
> + set_error (errno, "realloc");
> + goto fail;
> + }
> +
> + /* O(n^2) search - hopefully filter and negotiated contexts are both small
> */
This is (sort of) remotely exploitable? If the server sends back an
insane number of contexts? I'm not sure.
But I wonder if it would work to keep the context strings sorted, then
sort the filter strings, which could reduce this loop to O(n)-ish
(apart from the sorts).
> + for ( ; (name = *filter) != NULL; filter++) {
> + if (!h->meta_valid) {
> + set_error (EINVAL, "context %s not negotiated", name);
> + goto fail;
> + }
> + for (i = 0; i < h->meta_contexts.len; i++) {
> + struct meta_context *meta = &h->meta_contexts.ptr[i];
> + if (strcmp (name, meta->name) == 0) {
> + if (uint32_vector_append (ids, htobe32 (meta->context_id)) == -1) {
> + set_error (errno, "realloc");
> + goto fail;
> + }
> + break;
> + }
> + }
> + if (i == h->meta_contexts.len) {
> + set_error (EINVAL, "context %s not negotiated", name);
> + goto fail;
> + }
> + }
> +
> + SET_CALLBACK_TO_NULL (*extent64);
> + SET_CALLBACK_TO_NULL (*completion);
> + return nbd_internal_command_common (h, flags, NBD_CMD_BLOCK_STATUS, offset,
> + count, EINVAL, ids, &cb);
> +
> + fail:
> + uint32_vector_reset (ids);
> + free (ids);
> + return -1;
> +}
> diff --git a/interop/block-status-payload.c b/interop/block-status-payload.c
> index 9603dfe5..704b25aa 100644
> --- a/interop/block-status-payload.c
> +++ b/interop/block-status-payload.c
> @@ -54,11 +54,26 @@ cb (void *opaque, const char *metacontext, uint64_t
> offset,
> return 0;
> }
>
> +static char **
> +list (unsigned int use)
> +{
> + static const char *array[ARRAY_SIZE (contexts) + 1];
> + size_t i, j;
> +
> + assert (use < 1 << ARRAY_SIZE (contexts));
> + for (i = j = 0; i < ARRAY_SIZE (contexts); i++)
> + if (use & (1 << i))
> + array[j++] = contexts[i];
> + array[j] = NULL;
> + return (char **) array;
> +}
> +
> int
> main (int argc, char *argv[])
> {
> struct nbd_handle *nbd;
> int64_t exportsize;
> + uint64_t bytes_sent;
> unsigned int seen;
> size_t i;
> int r;
> @@ -114,7 +129,107 @@ main (int argc, char *argv[])
> }
> assert (seen == 0xf);
>
> - /* FIXME: Test filtered calls once the API is added */
> + /* Filtering with all contexts listed, same effect as unfilitered call */
> + seen = 0;
> + if (nbd_block_status_filter (nbd, exportsize, 0, list (0xf),
> + (nbd_extent64_callback) { .callback = cb,
> + .user_data = &seen
> },
> + 0) == -1) {
> + fprintf (stderr, "%s\n", nbd_get_error ());
> + exit (EXIT_FAILURE);
> + }
> + assert (seen == 0xf);
> +
> + /* Filtering with just two out of four contexts; test optional flag */
> + seen = 0;
> + if (nbd_block_status_filter (nbd, exportsize, 0, list (0x5),
> + (nbd_extent64_callback) { .callback = cb,
> + .user_data = &seen
> },
> + LIBNBD_CMD_FLAG_PAYLOAD_LEN) == -1) {
> + fprintf (stderr, "%s\n", nbd_get_error ());
> + exit (EXIT_FAILURE);
> + }
> + assert (seen == 0x5);
> +
> + /* Filtering with one context, near end of file (to make sure the
> + * payload length isn't confused with the effect length)
> + */
> + seen = 0;
> + if (nbd_block_status_filter (nbd, 1, exportsize - 1, list (0x2),
> + (nbd_extent64_callback) { .callback = cb,
> + .user_data = &seen
> },
> + 0) == -1) {
> + fprintf (stderr, "%s\n", nbd_get_error ());
> + exit (EXIT_FAILURE);
> + }
> + assert (seen == 0x2);
> +
> + /* Filtering with no contexts - pointless, so qemu rejects it */
> + bytes_sent = nbd_stats_bytes_sent (nbd);
> + seen = 0;
> + if (nbd_block_status_filter (nbd, exportsize, 0, list (0x0),
> + (nbd_extent64_callback) { .callback = cb,
> + .user_data = &seen
> },
> + 0) != -1) {
> + fprintf (stderr, "expecting block status failure\n");
> + exit (EXIT_FAILURE);
> + }
> + assert (seen == 0x0);
> + if (nbd_get_errno () != EINVAL) {
> + fprintf (stderr, "expecting EINVAL after block status failure\n");
> + exit (EXIT_FAILURE);
> + }
> + if (nbd_stats_bytes_sent (nbd) <= bytes_sent) {
> + fprintf (stderr, "expecting server-side rejection of bad request\n");
> + exit (EXIT_FAILURE);
> + }
> +
> + /* Giving unknown string triggers EINVAL from libnbd */
> + bytes_sent = nbd_stats_bytes_sent (nbd);
> + seen = 0;
> + {
> + const char *bogus[] = { "qemu:dirty-bitmap:bitmap2", NULL };
> + if (nbd_block_status_filter (nbd, exportsize, 0, (char **) bogus,
> + (nbd_extent64_callback) { .callback = cb,
> + .user_data =
> &seen },
> + 0) != -1) {
> + fprintf (stderr, "expecting block status failure\n");
> + exit (EXIT_FAILURE);
> + }
> + }
> + if (nbd_get_errno () != EINVAL) {
> + fprintf (stderr, "expecting EINVAL after block status failure\n");
> + exit (EXIT_FAILURE);
> + }
> + assert (seen == 0x0);
> + if (nbd_stats_bytes_sent (nbd) != bytes_sent) {
> + fprintf (stderr, "expecting client-side rejection of bad request\n");
> + exit (EXIT_FAILURE);
> + }
> +
> + /* Giving same string twice triggers EINVAL from qemu */
> + seen = 0;
> + {
> + const char *dupes[] = { "base:allocation", "base:allocation", NULL };
> + if (nbd_block_status_filter (nbd, exportsize, 0, (char **) dupes,
> + (nbd_extent64_callback) { .callback = cb,
> + .user_data =
> &seen },
> + 0) != -1) {
> + fprintf (stderr, "expecting block status failure\n");
> + exit (EXIT_FAILURE);
> + }
> + }
> + if (nbd_get_errno () != EINVAL) {
> + fprintf (stderr, "expecting EINVAL after block status failure\n");
> + exit (EXIT_FAILURE);
> + }
> + assert (seen == 0x0);
> + if (nbd_stats_bytes_sent (nbd) <= bytes_sent) {
> + fprintf (stderr, "expecting server-side rejection of bad request\n");
> + exit (EXIT_FAILURE);
> + }
> +
> + /* Done */
> if (nbd_shutdown (nbd, 0) == -1) {
> fprintf (stderr, "%s\n", nbd_get_error ());
> exit (EXIT_FAILURE);
> diff --git a/interop/block-status-payload.sh b/interop/block-status-payload.sh
> index a12cfc8a..0e6681b6 100755
> --- a/interop/block-status-payload.sh
> +++ b/interop/block-status-payload.sh
> @@ -49,6 +49,7 @@ args = ["qemu-nbd", "-f", "qcow2", "-A", "-B", "bitmap0",
> "-B", "bitmap1",
> h.connect_systemd_socket_activation(args)
> assert h.aio_is_negotiating() is True
> assert h.get_extended_headers_negotiated() is False
> +
> # Flag not available until info or go
> try:
> h.can_block_status_payload()
> @@ -58,7 +59,18 @@ except nbd.Error:
> h.opt_info()
> assert h.can_block_status_payload() is False
> assert h.can_meta_context("base:allocation") is True
> -h.opt_abort()
> +
> +# Filter request not allowed if not advertised
> +def f():
> + assert False
> +h.opt_go()
> +assert h.can_block_status_payload() is False
> +try:
> + h.block_status_filter(0, 512, ["base:allocation"], f)
> + assert False
> +except nbd.Error:
> + pass
> +h.shutdown()
> '
>
> # Conditional part of test: if qemu is new enough to support extended
> diff --git a/info/info-can.sh b/info/info-can.sh
> index 8154d1ce..097837d2 100755
> --- a/info/info-can.sh
> +++ b/info/info-can.sh
> @@ -38,6 +38,9 @@ requires bash -c "nbdkit sh --dump-plugin | grep
> has_can_cache=1"
> # and oldstyle never, but that feels like depending a bit too much on
> # the implementation.
>
> +# --can block-status-payload is not supported by nbdkit yet. Testing
> +# is done during interop with new-enough qemu.
> +
> # --can structured-reply is not a per-export setting, but rather
> # something set on the server as a whole.
Seems generally OK, so:
Acked-by: Richard W.M. Jones <rjones@redhat.com>
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- Re: [Libguestfs] [libnbd PATCH v3 22/22] api: Add nbd_[aio_]block_status_filter(),
Richard W.M. Jones <=