qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 10/18] qjson: report an error if there are mu


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v2 10/18] qjson: report an error if there are multiple results
Date: Mon, 23 Jul 2018 07:34:32 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Marc-André Lureau <address@hidden> writes:

> Hi
>
> On Fri, Jul 20, 2018 at 10:49 AM, Markus Armbruster <address@hidden> wrote:
>> Marc-André Lureau <address@hidden> writes:
>>
>>> qobject_from_jsonv() returns a single object. Let's make sure that
>>> during parsing we don't leak an intermediary object. Instead of
>>> returning the last object, set a parsing error.
>>>
>>> Also, the lexer/parser keeps consuming all the data. There might be an
>>> error set earlier. Let's keep it and not call json_parser_parse() with
>>> the remaining data. Eventually, we may teach the message parser to
>>> stop consuming the data.
>>
>> Took me a while to figure out what you mean :)
>>
>> qobject_from_jsonv() feeds a complete string to the JSON parser, with
>> json_message_parser_feed().  This actually feeds one character after the
>> other to the lexer, via json_lexer_feed_char().
>>
>> Whenever a character completes a token, the lexer feeds that token to
>> the streamer via a callback that is always json_message_process_token().
>>
>> The attentive reader may wonder what it does with trailing characters
>> that aren't a complete token.  More on that below.
>>
>> The streamer accumulates tokens, counting various parenthesis.  When all
>> counts are zero or any is negative, the group is complete, and is fed to
>> the parser via another callback.  That callback is parse_json() here.
>> There are more elsewhere.
>>
>> The attentive reader may wonder what it does with trailing tokens that
>> aren't a complete group.  More on that below.
>>
>> The callbacks all pass the tokens to json_parser_parse() to do the
>> actual parsing.  Returns the abstract syntax tree as QObject on success.
>>
>> Note that the callback can be called any number of times.
>>
>> In my opinion, this is over-engineered and under-thought.  There's more
>> flexibility than we're using, and the associated complexity breeds bugs.

Two obvious simplifications come to mind:

* Call json_message_process_token() directly.

* Move the call json_parser_parse() into the streamer, and have the
  streamer pass QObject * instead of GQueue * to its callback.

I'll post patches for your consideration, but first I'll continue to
review your series.

>> In particular, parse_json() is wrong:
>>
>>     s->result = json_parser_parse(tokens, s->ap, &s->err);
>>
>> If the previous call succeeded, we leak its result.  This is the leak
>> you mentioned.
>>
>> If any previous call set an error, we pass &s->err pointing to non-null,
>> which is forbidden.  If json_parser_parse() passed it to error_setg(),
>> this would crash.  Since it passes it only to error_propagate(), all
>> errors but the first one are ignored.  Latent crash bug.
>>
>> If the last call succeeds and any previous call failed, the function
>> simultaneously succeeds and fails: we return both a result and set an
>> error.
>>
>> Let's demonstrate these bugs before we fix them, by inserting the
>> appended patch before this one.
>>
>> Are the other callbacks similarly buggy?  Turns out they're okay:
>>
>> * handle_qmp_command() consumes result and error on each call.
>>
>> * process_event() does, too.
>>
>> * qmp_response() treats errors as fatal, and asserts "no previous
>>   response".
>>
>> On trailing tokens that don't form a group: they get silently ignored,
>> as demonstrated by check-qjson test cases unterminated_array(),
>> unterminated_array_comma(), unterminated_dict(),
>> unterminated_dict_comma(), all marked BUG.
>>
>> On trailing characters that don't form a token: they get silently
>> ignored, as demonstrated by check-qjson test cases
>> unterminated_string(), unterminated_sq_string(), unterminated_escape(),
>> all marked BUG, except when they aren't, as in test case
>> unterminated_literal().
>>
>> The BUG marks are all gone at the end of this series.  I guess you're
>> fixing all that :)
>>
>> Note that these "trailing characters / tokens are silently ignored" bugs
>> *do* affect the other callbacks.  Reproducer for handle_qmp_command():
>>
>>     $ echo -e '{ "execute": "qmp_capabilities" }\n{ "execute": "query-name" 
>> }\n"unterminated' | socat UNIX:test-qmp STDIO
>>     {"QMP": {"version": {"qemu": {"micro": 90, "minor": 12, "major": 2}, 
>> "package": "v3.0.0-rc1-20-g6a024cd461"}, "capabilities": ["oob"]}}
>>     {"return": {}}
>>     {"return": {}}
>>
>> Note there's no error reported for the last line.
>>
>>> Signed-off-by: Marc-André Lureau <address@hidden>
>>> ---
>>>  qobject/qjson.c     | 16 +++++++++++++++-
>>>  tests/check-qjson.c | 11 +++++++++++
>>>  2 files changed, 26 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qobject/qjson.c b/qobject/qjson.c
>>> index ee04e61096..8a9d116150 100644
>>> --- a/qobject/qjson.c
>>> +++ b/qobject/qjson.c
>>> @@ -22,6 +22,7 @@
>>>  #include "qapi/qmp/qlist.h"
>>>  #include "qapi/qmp/qnum.h"
>>>  #include "qapi/qmp/qstring.h"
>>> +#include "qapi/qmp/qerror.h"
>>>  #include "qemu/unicode.h"
>>>
>>>  typedef struct JSONParsingState
>>> @@ -36,7 +37,20 @@ static void parse_json(JSONMessageParser *parser, GQueue 
>>> *tokens)
>>>  {
>>>      JSONParsingState *s = container_of(parser, JSONParsingState, parser);
>>>
>>> -    s->result = json_parser_parse(tokens, s->ap, &s->err);
>>> +    if (s->result || s->err) {
>>> +        if (s->result) {
>>> +            qobject_unref(s->result);
>>> +            s->result = NULL;
>>> +            if (!s->err) {
>>
>> Conditional is necessary because a previous call to json_parser_parse()
>> could have set s->err already.  Without it, error_setg() would fail the
>> assertion in error_setv() then.  Subtle.
>>
>>> +                error_setg(&s->err, QERR_JSON_PARSING);
>>
>> Hmm.  Is this really "Invalid JSON syntax"?  In a way, it is, because
>> RFC 7159 defines JSON-text as a single value.  Still, a friendlier error
>> message like "Expecting at most one JSON value" woun't hurt.
>>
>> What if !tokens?  That shouldn't be an error.
>>
>>> +            }
>>> +        }
>>> +        if (tokens) {
>>> +            g_queue_free_full(tokens, g_free);
>>
>> g_queue_free_full(NULL, ...) doesn't work?!?  That would be lame.
>
> It warns and return.

Lame.

> We could use g_clear_pointer(&tokens, g_queue_free_full)...

Slightly terser, but the reader better knows g_clear_pointer().  So far,
we don't use it.

>>> +        }
>>> +    } else {
>>> +        s->result = json_parser_parse(tokens, s->ap, &s->err);
>>> +    }
>>>  }
>>
>> What about this:
>>
>>        JSONParsingState *s = container_of(parser, JSONParsingState, parser);
>>        Error *err = NULL;
>>
>>        if (!tokens) {
>>            return;
>>        }
>
> I would rather leave handling of NULL tokens to json_parser_parse()
> for consistency with other callers.

Fair point.

I expect the issue to evaporate when I simplify the parser interface.

>>        if (s->result || s->err) {
>>            if (s->result) {
>>                qobject_unref(s->result);
>>                s->result = NULL;
>>                error_setg(&err, "Expecting at most one JSON value");
>>            }
>>            g_queue_free_full(tokens, g_free);
>
> missing null check

@tokens can't be null here.

>>        } else {
>>            s->result = json_parser_parse(tokens, s->ap, &err);
>>        }
>>        error_propagate(&s->err, err);
>
> how do you ensure s->err is not already set?

error_propagate() is fine with that, unlike error_setg().

>>        assert(!s->result != !s->err);
>>
>>>
>>>  QObject *qobject_from_jsonv(const char *string, va_list *ap, Error **errp)
>>> diff --git a/tests/check-qjson.c b/tests/check-qjson.c
>>> index da582df3e9..7d3547e0cc 100644
>>> --- a/tests/check-qjson.c
>>> +++ b/tests/check-qjson.c
>>> @@ -1417,6 +1417,16 @@ static void limits_nesting(void)
>>>      g_assert(obj == NULL);
>>>  }
>>>
>>> +static void multiple_objects(void)
>>> +{
>>> +    Error *err = NULL;
>>> +    QObject *obj;
>>> +
>>> +    obj = qobject_from_json("{} {}", &err);
>>> +    g_assert(obj == NULL);
>>> +    error_free_or_abort(&err);
>>> +}
>>> +
>>>  int main(int argc, char **argv)
>>>  {
>>>      g_test_init(&argc, &argv, NULL);
>>> @@ -1454,6 +1464,7 @@ int main(int argc, char **argv)
>>>      g_test_add_func("/errors/invalid_dict_comma", invalid_dict_comma);
>>>      g_test_add_func("/errors/unterminated/literal", unterminated_literal);
>>>      g_test_add_func("/errors/limits/nesting", limits_nesting);
>>> +    g_test_add_func("/errors/multiple_objects", multiple_objects);
>>>
>>>      return g_test_run();
>>>  }
>>
>> From 18064b774ea7a79a9dbabe5cf75458f0326070d5 Mon Sep 17 00:00:00 2001
>> From: Markus Armbruster <address@hidden>
>> Date: Fri, 20 Jul 2018 10:22:41 +0200
>> Subject: [PATCH] check-qjson: Cover multiple JSON objects in same string
>>
>> qobject_from_json() & friends misbehave when the JSON text has more
>> than one JSON value.  Add test coverage to demonstrate the bugs.
>>
>> Signed-off-by: Markus Armbruster <address@hidden>
>
> test looks better than mine, should I include it in the series before the fix?

Yes, please.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]