qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 10/18] qjson: report an error if there are mu


From: Marc-André Lureau
Subject: Re: [Qemu-devel] [PATCH v2 10/18] qjson: report an error if there are multiple results
Date: Fri, 20 Jul 2018 12:41:31 +0200

Hi

On Fri, Jul 20, 2018 at 10:49 AM, Markus Armbruster <address@hidden> wrote:
> Marc-André Lureau <address@hidden> writes:
>
>> qobject_from_jsonv() returns a single object. Let's make sure that
>> during parsing we don't leak an intermediary object. Instead of
>> returning the last object, set a parsing error.
>>
>> Also, the lexer/parser keeps consuming all the data. There might be an
>> error set earlier. Let's keep it and not call json_parser_parse() with
>> the remaining data. Eventually, we may teach the message parser to
>> stop consuming the data.
>
> Took me a while to figure out what you mean :)
>
> qobject_from_jsonv() feeds a complete string to the JSON parser, with
> json_message_parser_feed().  This actually feeds one character after the
> other to the lexer, via json_lexer_feed_char().
>
> Whenever a character completes a token, the lexer feeds that token to
> the streamer via a callback that is always json_message_process_token().
>
> The attentive reader may wonder what it does with trailing characters
> that aren't a complete token.  More on that below.
>
> The streamer accumulates tokens, counting various parenthesis.  When all
> counts are zero or any is negative, the group is complete, and is fed to
> the parser via another callback.  That callback is parse_json() here.
> There are more elsewhere.
>
> The attentive reader may wonder what it does with trailing tokens that
> aren't a complete group.  More on that below.
>
> The callbacks all pass the tokens to json_parser_parse() to do the
> actual parsing.  Returns the abstract syntax tree as QObject on success.
>
> Note that the callback can be called any number of times.
>
> In my opinion, this is over-engineered and under-thought.  There's more
> flexibility than we're using, and the associated complexity breeds bugs.
>
> In particular, parse_json() is wrong:
>
>     s->result = json_parser_parse(tokens, s->ap, &s->err);
>
> If the previous call succeeded, we leak its result.  This is the leak
> you mentioned.
>
> If any previous call set an error, we pass &s->err pointing to non-null,
> which is forbidden.  If json_parser_parse() passed it to error_setg(),
> this would crash.  Since it passes it only to error_propagate(), all
> errors but the first one are ignored.  Latent crash bug.
>
> If the last call succeeds and any previous call failed, the function
> simultaneously succeeds and fails: we return both a result and set an
> error.
>
> Let's demonstrate these bugs before we fix them, by inserting the
> appended patch before this one.
>
> Are the other callbacks similarly buggy?  Turns out they're okay:
>
> * handle_qmp_command() consumes result and error on each call.
>
> * process_event() does, too.
>
> * qmp_response() treats errors as fatal, and asserts "no previous
>   response".
>
> On trailing tokens that don't form a group: they get silently ignored,
> as demonstrated by check-qjson test cases unterminated_array(),
> unterminated_array_comma(), unterminated_dict(),
> unterminated_dict_comma(), all marked BUG.
>
> On trailing characters that don't form a token: they get silently
> ignored, as demonstrated by check-qjson test cases
> unterminated_string(), unterminated_sq_string(), unterminated_escape(),
> all marked BUG, except when they aren't, as in test case
> unterminated_literal().
>
> The BUG marks are all gone at the end of this series.  I guess you're
> fixing all that :)
>
> Note that these "trailing characters / tokens are silently ignored" bugs
> *do* affect the other callbacks.  Reproducer for handle_qmp_command():
>
>     $ echo -e '{ "execute": "qmp_capabilities" }\n{ "execute": "query-name" 
> }\n"unterminated' | socat UNIX:test-qmp STDIO
>     {"QMP": {"version": {"qemu": {"micro": 90, "minor": 12, "major": 2}, 
> "package": "v3.0.0-rc1-20-g6a024cd461"}, "capabilities": ["oob"]}}
>     {"return": {}}
>     {"return": {}}
>
> Note there's no error reported for the last line.
>
>> Signed-off-by: Marc-André Lureau <address@hidden>
>> ---
>>  qobject/qjson.c     | 16 +++++++++++++++-
>>  tests/check-qjson.c | 11 +++++++++++
>>  2 files changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/qobject/qjson.c b/qobject/qjson.c
>> index ee04e61096..8a9d116150 100644
>> --- a/qobject/qjson.c
>> +++ b/qobject/qjson.c
>> @@ -22,6 +22,7 @@
>>  #include "qapi/qmp/qlist.h"
>>  #include "qapi/qmp/qnum.h"
>>  #include "qapi/qmp/qstring.h"
>> +#include "qapi/qmp/qerror.h"
>>  #include "qemu/unicode.h"
>>
>>  typedef struct JSONParsingState
>> @@ -36,7 +37,20 @@ static void parse_json(JSONMessageParser *parser, GQueue 
>> *tokens)
>>  {
>>      JSONParsingState *s = container_of(parser, JSONParsingState, parser);
>>
>> -    s->result = json_parser_parse(tokens, s->ap, &s->err);
>> +    if (s->result || s->err) {
>> +        if (s->result) {
>> +            qobject_unref(s->result);
>> +            s->result = NULL;
>> +            if (!s->err) {
>
> Conditional is necessary because a previous call to json_parser_parse()
> could have set s->err already.  Without it, error_setg() would fail the
> assertion in error_setv() then.  Subtle.
>
>> +                error_setg(&s->err, QERR_JSON_PARSING);
>
> Hmm.  Is this really "Invalid JSON syntax"?  In a way, it is, because
> RFC 7159 defines JSON-text as a single value.  Still, a friendlier error
> message like "Expecting at most one JSON value" woun't hurt.
>
> What if !tokens?  That shouldn't be an error.
>
>> +            }
>> +        }
>> +        if (tokens) {
>> +            g_queue_free_full(tokens, g_free);
>
> g_queue_free_full(NULL, ...) doesn't work?!?  That would be lame.

It warns and return.

We could use g_clear_pointer(&tokens, g_queue_free_full)...

>
>> +        }
>> +    } else {
>> +        s->result = json_parser_parse(tokens, s->ap, &s->err);
>> +    }
>>  }
>
> What about this:
>
>        JSONParsingState *s = container_of(parser, JSONParsingState, parser);
>        Error *err = NULL;
>
>        if (!tokens) {
>            return;
>        }

I would rather leave handling of NULL tokens to json_parser_parse()
for consistency with other callers.

>        if (s->result || s->err) {
>            if (s->result) {
>                qobject_unref(s->result);
>                s->result = NULL;
>                error_setg(&err, "Expecting at most one JSON value");
>            }
>            g_queue_free_full(tokens, g_free);

missing null check

>        } else {
>            s->result = json_parser_parse(tokens, s->ap, &err);
>        }
>        error_propagate(&s->err, err);

how do you ensure s->err is not already set?

>        assert(!s->result != !s->err);
>
>>
>>  QObject *qobject_from_jsonv(const char *string, va_list *ap, Error **errp)
>> diff --git a/tests/check-qjson.c b/tests/check-qjson.c
>> index da582df3e9..7d3547e0cc 100644
>> --- a/tests/check-qjson.c
>> +++ b/tests/check-qjson.c
>> @@ -1417,6 +1417,16 @@ static void limits_nesting(void)
>>      g_assert(obj == NULL);
>>  }
>>
>> +static void multiple_objects(void)
>> +{
>> +    Error *err = NULL;
>> +    QObject *obj;
>> +
>> +    obj = qobject_from_json("{} {}", &err);
>> +    g_assert(obj == NULL);
>> +    error_free_or_abort(&err);
>> +}
>> +
>>  int main(int argc, char **argv)
>>  {
>>      g_test_init(&argc, &argv, NULL);
>> @@ -1454,6 +1464,7 @@ int main(int argc, char **argv)
>>      g_test_add_func("/errors/invalid_dict_comma", invalid_dict_comma);
>>      g_test_add_func("/errors/unterminated/literal", unterminated_literal);
>>      g_test_add_func("/errors/limits/nesting", limits_nesting);
>> +    g_test_add_func("/errors/multiple_objects", multiple_objects);
>>
>>      return g_test_run();
>>  }
>
> From 18064b774ea7a79a9dbabe5cf75458f0326070d5 Mon Sep 17 00:00:00 2001
> From: Markus Armbruster <address@hidden>
> Date: Fri, 20 Jul 2018 10:22:41 +0200
> Subject: [PATCH] check-qjson: Cover multiple JSON objects in same string
>
> qobject_from_json() & friends misbehave when the JSON text has more
> than one JSON value.  Add test coverage to demonstrate the bugs.
>
> Signed-off-by: Markus Armbruster <address@hidden>

test looks better than mine, should I include it in the series before the fix?

> ---
>  tests/check-qjson.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/tests/check-qjson.c b/tests/check-qjson.c
> index da582df3e9..c5fd439263 100644
> --- a/tests/check-qjson.c
> +++ b/tests/check-qjson.c
> @@ -1417,6 +1417,25 @@ static void limits_nesting(void)
>      g_assert(obj == NULL);
>  }
>
> +static void multiple_objects(void)
> +{
> +    Error *err = NULL;
> +    QObject *obj;
> +
> +    /* BUG this leaks the syntax tree for "false" */
> +    obj = qobject_from_json("false true", &err);
> +    g_assert(qbool_get_bool(qobject_to(QBool, obj)));
> +    g_assert(!err);
> +    qobject_unref(obj);
> +
> +    /* BUG simultaneously succeeds and fails */
> +    /* BUG calls json_parser_parse() with errp pointing to non-null */
> +    obj = qobject_from_json("} true", &err);
> +    g_assert(qbool_get_bool(qobject_to(QBool, obj)));
> +    error_free_or_abort(&err);
> +    qobject_unref(obj);
> +}
> +
>  int main(int argc, char **argv)
>  {
>      g_test_init(&argc, &argv, NULL);
> @@ -1454,6 +1473,7 @@ int main(int argc, char **argv)
>      g_test_add_func("/errors/invalid_dict_comma", invalid_dict_comma);
>      g_test_add_func("/errors/unterminated/literal", unterminated_literal);
>      g_test_add_func("/errors/limits/nesting", limits_nesting);
> +    g_test_add_func("/errors/multiple_objects", multiple_objects);
>
>      return g_test_run();
>  }
> --
> 2.17.1
>



reply via email to

[Prev in Thread] Current Thread [Next in Thread]