qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH v5 11/11] nbd: Minimal structured read for clien


From: Eric Blake
Subject: Re: [Qemu-block] [PATCH v5 11/11] nbd: Minimal structured read for client
Date: Fri, 20 Oct 2017 15:46:30 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0

On 10/20/2017 02:58 PM, Vladimir Sementsov-Ogievskiy wrote:
> 20.10.2017 01:26, Eric Blake wrote:
>> From: Vladimir Sementsov-Ogievskiy <address@hidden>
>>
>> Minimal implementation: for structured error only error_report error
>> message.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
>> Signed-off-by: Eric Blake <address@hidden>
>>

>> +static int nbd_parse_offset_hole_payload(NBDStructuredReplyChunk *chunk,
>> +                                         uint8_t *payload,
>> QEMUIOVector *qiov,
>> +                                         Error **errp)
>> +{
>> +    uint64_t offset;
>> +    uint32_t hole_size;
>> +
>> +    if (chunk->length != sizeof(offset) + sizeof(hole_size)) {
>> +        error_setg(errp, "Protocol error: invalid payload for "
>> +                         "NBD_REPLY_TYPE_OFFSET_HOLE");
>> +        return -EINVAL;
>> +    }
>> +
>> +    offset = payload_advance64(&payload);
>> +    hole_size = payload_advance32(&payload);
>> +
>> +    if (offset > qiov->size - hole_size) {
> 
> hmm, you've changed order to prevent overflow.. can it overflow anyway
> through negative minimum on 32-bit system with some unhappy size and
> hole_size?

Good catch; qiov->size is size_t, hole_size is uint32_t.  qiov->size
happens to be bounded by NBD_MAX_BUFFER_SIZE (we wouldn't have sent the
request otherwise, to be getting the reply now), but hole_size came over
the wire and must be considered untrusted.  So you are correct that we
don't know that we avoided overflow unless we also compare the two
values, as in:

if (hole_size > qiov->size || offset > qiov->size - hole_size)

in both affected functions.


>> +    *request_ret = -error;
>> +    message_size = payload_advance16(&payload);
>> +
>> +    if (message_size > chunk->length - sizeof(error) -
>> sizeof(message_size)) {
>> +        error_setg(errp, "Protocol error: server sent structured
>> error chunk"
>> +                         "with incorrect message size");
>> +        return -EINVAL;
>> +    }
>> +
> 
> you removed my error message from errp, but didn't change comment..
> 
> And you lose the message.. At least add a "TODO" for it...
> 
>> +    /* TODO: Add a trace point to mention the server complaint */

...

>> +
>> +#define NBD_MAX_MALLOC_PAYLOAD 1000
>> +/* nbd_co_receive_structured_payload
>> + * The resulting pointer stored in @payload requires g_free() to free
>> it.
> 
> I think now it is an extra comment..
> (and all it's duplication)

True. g_new() is normal enough that the comment doesn't add as much (the
comment was important when we were using the unusual qemu_memalign(),
but we don't need that).


>> +
>> +    if (nbd_reply_type_is_error(chunk->type)) {
>> +        ret = nbd_parse_error_payload(chunk, local_payload,
>> request_ret, errp);
>> +        g_free(local_payload);
> 
> and error message is lost.. So you don't like an idea of saving it in errp?

...because storing in errp, but returning 0, is wrong.  We do NOT need
to be spamming stderr with every time the server replied with an error,
because that is NORMAL behavior that can (sometimes) be gracefully
recovered from - the most obvious situation is ENOSPC errors resulting
from write(), where we add the proposed NBD resize extension to allow
the client to resize the server and then try again.  Tracing server
messages might be useful, but reporting them indiscriminately is not.  I
dropped the setting of errp in these two places because of the
regression I mentioned in this mail:
https://lists.gnu.org/archive/html/qemu-devel/2017-10/msg04796.html

where setting errp but returning 0 led to duplicate or
poorly-constructed messages.

In my testing, it worked best to set errp ONLY in the situations where
we are giving up all future communication with the server, and not
merely because the server reported an error but we are still fine to
keep talking to the server.


>> -    return nbd_co_receive_reply(client, request->handle,
>> -                                request->type == NBD_CMD_READ ? qiov
>> : NULL);
>> +    ret = nbd_co_receive_return_code(client, request->handle,
>> &local_err);
>> +    if (local_err) {
> 
> hmm, you changed it from "if (ret < 0)"... It doesn't matter, just note
> that here (local_err != NULL) <=> (ret < 0).

This was the trick to the errp stuff above: we HAVE to return a negative
errno code to the caller whether the connection was fatally wounded or
whether it is just reporting a server error code; so at this point, ret
< 0 is NOT equivalent to local_err != NULL.  I got a coredump if I used
'if (ret < 0)' on situations where the server replied with an error, but
where it was not fatal so we did not set local_err, because you can't
error_report_err(NULL).

> 
>> +        error_report_err(local_err);
>> +    }
>> +    return ret;
>>   }
> 
> [...]
> 
> 
> I'm ok with this, we can improve error handling later.

Indeed - I'd like to get as much in as possible for soft freeze, then
focus on polishing things before hard freeze.  I don't know if we'll get
block status in, or just structured read; and I'd really like to be able
to read holes, but one thing at a time...

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]