[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH for-4.0] nbd: Permit simple error to NBD_CMD_BLO
From: |
Eric Blake |
Subject: |
Re: [Qemu-devel] [PATCH for-4.0] nbd: Permit simple error to NBD_CMD_BLOCK_STATUS |
Date: |
Mon, 25 Mar 2019 11:21:06 -0500 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.0 |
On 3/25/19 11:04 AM, Vladimir Sementsov-Ogievskiy wrote:
>>>> +++ b/block/nbd-client.c
>>>> @@ -718,9 +718,7 @@ static int
>>>> nbd_co_receive_blockstatus_reply(NBDClientSession *s,
>>>> bool received = false;
>>>>
>>>> assert(!extent->length);
>>>> - NBD_FOREACH_REPLY_CHUNK(s, iter, handle, s->info.structured_reply,
>>>> - NULL, &reply, &payload)
>>>> - {
>>>> + NBD_FOREACH_REPLY_CHUNK(s, iter, handle, false, NULL, &reply,
>>>> &payload) {
>>
>> This part changes things to permit a simple reply if that reply is an
>> error. In turn, if the server sends a simple reply error,
>> iter.request_ret is set to the error returned by the server while
>> leaving iter.ret and iter.err unset (because we successfully got the
>> server's reply and can keep the connection alive), with commit 7f86068d
>> in play (prior to that commit, iter.ret was left 0 because the server
>> replied successfully, while iter.err was set to the set to the server's
>> error, while leaving iter.fatal unset).
If this hunk is not present, then NBD_FOREACH_REPLY_CHUNK sets iter.err
when encountering a simple error reply from the server (the server
violated the client's expectations by not sending a structured error,
give up on communicating with the server). When this hunk is added,
NBD_FOREACH_REPLY_CHUNK now sets iter.request_ret instead (the server
gave us an error reply, and we are happy with that whether it was an
simple or structured error).
>>
>>>> int ret;
>>>> NBDStructuredReplyChunk *chunk = &reply.structured;
>>>>
>>>> @@ -758,9 +756,11 @@ static int
>>>> nbd_co_receive_blockstatus_reply(NBDClientSession *s,
>>>> payload = NULL;
>>>> }
>>>>
>>>> - if (!extent->length && !iter.err) {
>>>> - error_setg(&iter.err,
>>>> - "Server did not reply with any status extents");
>>>> + if (!extent->length && !iter.request_ret) {
>>>
>>> Hmm, I don't see, what is changed. You just don't set ret and err if there
>>> already
>>> set request_ret.. So, finally it may lead to other return code in
>>> nbd_client_co_block_status
>>> and local_err will not be set and traced. But there is no connection close
>>> in this case
>>> in current code, s->quit is not set.
>>
>> And that's okay. The whole point of this is that if the server replied
>> with an error message (iter.request_ret), then it is okay that the
>> server did not give us any extent->length, and the caller will turn
>> iter.request_ret into the EINVAL reply (or whatever other errno value
>> the server sent) to .bdrv_co_block_status() while keeping the connection
>> alive. But if the server did NOT send an error reply (iter.request_ret
>> is 0), but also did not send us any status, then we need to turn that
>> into an error condition (because our caller expected us to make progress
>> on success, even though the server did not make progress).
>
>
> I just say, that I don't see "the client treats the server's
> reply as fatal and hangs up on the connection" in this place before your
> change.
> Or what you mean?
When you fix the first bug (the client setting iter.err on a simple
error reply from the server because the server's reply wasn't
structured, to now the client setting just iter.request_ret because it
successfully parsed an error out of the server's reply), this code is
now reachable with iter.err == NULL. Since the server replied with an
error, extent->length is 0. So this code (added in 7f86068d) says that
since iter.err is not set, we give up on the server and hang up (but
only for simple errors, not for structured errors). It is a regression,
because prior to 7f86068d, we did not have iter.request_ret, and instead
went by the presence or absence of iter.fatal, and iter.fatal was not
set when dealing with a simple error when a structured error was supplied.
So, the full setup of things to test:
server that always fails with structured error:
- client before 7f86068d (qemu 3.1): connection stays alive, regardless
of this patch
- client after 7f86068d: connection stays alive, regardless of this patch
server that always fails with simple error:
- client before 7f86068d: (second half of patch does not apply, so only
the first half is needed):
- without first half, connection dies because "Protocol error: simple
reply when structured reply chunk was expected"
- with first half, connection stays alive as it does for structured error
- client after 7f86068d, with various stages of this patch:
- without either half (or with second half only), connection dies
because "Protocol error: simple reply when structured reply chunk was
expected"
- with first half, connection dies because "server did not reply with
any status extents"
- with both halves, connection stays alive
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
signature.asc
Description: OpenPGP digital signature