[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] coroutine question, for NBD debugging
From: |
Paolo Bonzini |
Subject: |
Re: [Qemu-devel] coroutine question, for NBD debugging |
Date: |
Sat, 4 Nov 2017 18:51:23 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0 |
On 03/11/2017 21:03, Eric Blake wrote:
> In include/qemu/coroutine.h, we have:
>
> /**
> * Yield the coroutine for a given duration
> *
> * Behaves similarly to co_sleep_ns(), but the sleeping coroutine will be
> * resumed when using aio_poll().
> */
> void coroutine_fn co_aio_sleep_ns(AioContext *ctx, QEMUClockType type,
>
> but there is no co_sleep_ns() anywhere. What should the documentation
> actually state?
The old doc comment for co_sleep_ns was:
/**
* Yield the coroutine for a given duration
*
* Note this function uses timers and hence only works when a main loop is in
* use. See main-loop.h and do not use from qemu-tool programs.
*/
So probably:
This function uses timers and hence needs to know the event loop
(#AioContext) to place the timer on. In any case, co_aio_sleep_ns()
does not affect the #AioContext where the current coroutine is running,
as the coroutine will restart on the same #AioContext that it is
running on.
Thanks,
Paolo
>
> Meanwhile, I'm trying to figure out if there is an easy way to hack
> qemu-nbd to send two structured replies interleaved (as in A.1, B.1,
> A.2, B.2) to ensure the NBD client coroutines properly handle
> interleaved responses, but adding a plain sleep() between A.1 and A.2
> does not work, and I'm not sure whether co_aio_sleep_ns() would work for
> the hack instead.
>
> The hack I'm currently playing with splits reads and structured errors
> into two parts, attempting to use sleeps to force the server to send
> replies out of order:
>
> diff --git c/nbd/server.c w/nbd/server.c
> index bcf0cdb47c..4b642127bb 100644
> --- c/nbd/server.c
> +++ w/nbd/server.c
> @@ -1285,13 +1285,24 @@ static int coroutine_fn
> nbd_co_send_structured_read(NBDClient *client,
> {.iov_base = &chunk, .iov_len = sizeof(chunk)},
> {.iov_base = data, .iov_len = size}
> };
> + int ret;
>
> trace_nbd_co_send_structured_read(handle, offset, data, size);
> - set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_OFFSET_DATA,
> + set_be_chunk(&chunk.h, 0, NBD_REPLY_TYPE_OFFSET_DATA,
> handle, sizeof(chunk) - sizeof(chunk.h) + size);
> stq_be_p(&chunk.offset, offset);
>
> - return nbd_co_send_iov(client, iov, 2, errp);
> + ret = nbd_co_send_iov(client, iov, 2, errp);
> + if (ret < 0) {
> + return ret;
> + }
> +
> + sleep(2);
> +
> + set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_NONE,
> + handle, 0);
> + iov[0].iov_len = sizeof(chunk.h);
> + return nbd_co_send_iov(client, iov, 1, errp);
> }
>
> static int coroutine_fn nbd_co_send_structured_error(NBDClient *client,
> @@ -1306,16 +1317,27 @@ static int coroutine_fn
> nbd_co_send_structured_error(NBDClient *client,
> {.iov_base = &chunk, .iov_len = sizeof(chunk)},
> {.iov_base = (char *)msg, .iov_len = msg ? strlen(msg) : 0},
> };
> + int ret;
>
> assert(nbd_err);
> trace_nbd_co_send_structured_error(handle, nbd_err,
> nbd_err_lookup(nbd_err), msg ?
> msg : "");
> - set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_ERROR,
> handle,
> + set_be_chunk(&chunk.h, 0, NBD_REPLY_TYPE_ERROR, handle,
> sizeof(chunk) - sizeof(chunk.h) + iov[1].iov_len);
> stl_be_p(&chunk.error, nbd_err);
> stw_be_p(&chunk.message_length, iov[1].iov_len);
>
> - return nbd_co_send_iov(client, iov, 1 + !!iov[1].iov_len, errp);
> + ret = nbd_co_send_iov(client, iov, 1 + !!iov[1].iov_len, errp);
> + if (ret < 0) {
> + return ret;
> + }
> +
> + sleep(1);
> +
> + set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_NONE,
> + handle, 0);
> + iov[0].iov_len = sizeof(chunk.h);
> + return nbd_co_send_iov(client, iov, 1, errp);
> }
>
> /* nbd_co_receive_request
>
>
> coupled with a client that does:
>
> $ ./qemu-io -f raw nbd://localhost:10809/foo --trace='nbd_*' -c
> 'aio_write -P 3 1 1' -c 'aio_read -P 1 1 1'
>
> but the trace shows that the client does not receive B.1 until after it
> has blocked waiting for A.2, which is not what I was hoping to see.
>
signature.asc
Description: OpenPGP digital signature