qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 39/56] json: Leave rejecting invalid interpolati


From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH 39/56] json: Leave rejecting invalid interpolation to parser
Date: Mon, 13 Aug 2018 11:12:13 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0

On 08/08/2018 07:03 AM, Markus Armbruster wrote:
Both lexer and parser reject invalid interpolation specifications.
The parser's check is useless.

The lexer ends the token right after the first bad character.  This
tends to lead to suboptimal error reporting.  For instance, input

     [ %11d ]

With no context of other characters on this line, it took me a while to notice that this was '1' (one) not 'l' (ell), even though my current font renders the two quite distinctly. (And now you know why base32 avoids 0/1 in its alphabet, to minimize confusion with O/l - see RFC 4648). A better example might be %22d.


produces the tokens

     JSON_LSQUARE  [
     JSON_ERROR    %1
     JSON_INTEGER  1

And that's in spite of the context you have here, which makes it obvious that the parser saw an integer.

     JSON_KEYWORD  d
     JSON_RSQUARE  ]

The parser then yields an error, an object and two more errors:

     error: Invalid JSON syntax
     object: 1
     error: JSON parse error, invalid keyword
     error: JSON parse error, expecting value

Change the lexer to accept [A-Za-z0-9]*[duipsf].  It now produces

That regex doesn't match the code.


     JSON_LSQUARE  [
     JSON_INTERPOLATION %11d
     JSON_RSQUARE  ]

and the parser reports just

     JSON parse error, invalid interpolation '%11d'

Signed-off-by: Markus Armbruster <address@hidden>
---
  qobject/json-lexer.c  | 52 +++++++++----------------------------------
  qobject/json-parser.c |  1 +
  2 files changed, 11 insertions(+), 42 deletions(-)

diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c
index 0ea1eae4aa..7a82aab88b 100644
--- a/qobject/json-lexer.c
+++ b/qobject/json-lexer.c
@@ -93,7 +93,8 @@
   *   (apostrophe) instead of %x22 (quotation mark), and can't contain
   *   unescaped apostrophe, but can contain unescaped quotation mark.
   * - Interpolation:
- *   interpolation = %((l|ll|I64)[du]|[ipsf])
+ *   The lexer accepts [A-Za-z0-9]*, and leaves rejecting invalid ones
+ *   to the parser.

This comment is more apropos. But is it worth spelling "The lexer accepts %[A-Za-z0-9]*", and/or documenting that recognizing interpolation during lexing is now optional (thanks to the previous patch)?

@@ -278,6 +238,14 @@ static const uint8_t json_lexer[][256] =  {
          ['\n'] = IN_WHITESPACE,
      },
+ /* interpolation */
+    [IN_INTERPOL] = {
+        TERMINAL(JSON_INTERPOL),
+        ['A' ... 'Z'] = IN_INTERPOL,
+        ['a' ... 'z'] = IN_INTERPOL,
+        ['0' ... '9'] = IN_INTERPOL,
+    },
+

Note that we still treat code like "'foo': %#x" as an invalid interpolation request (it would be a valid printf format), while at the same time passing "%1" on to the parser (which is not a valid printf format, so -Wformat would have flagged it). In fact, "%d1" which is valid for printf as "%d" followed by a literal "1" gets thrown to the parser as one chunk - but it is not valid in JSON to have %d adjacent to another integer any more than it is to reject "%d1" as an unknown sequence. I don't think that catering to the remaining printf metacharacters (' ', '#', '\'', '-', '*', '+') nor demanding that things end on a letter is worth the effort, since it just makes the lexer more complicated when your goal was to do as little as possible to get things thrown over to the parser.

So even though your lexing is now somewhat different from printf, I don't see that as a serious drawback (the uses we care about are still -Wformat clean, and the uses we don't like are properly handled in the parser).

Perhaps you want to enhance the testsuite (earlier in the series) to cover "%22d", "%d1", "%1" as various unaccepted interpolation requests (if you didn't already).

Reviewed-by: Eric Blake <address@hidden>

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



reply via email to

[Prev in Thread] Current Thread [Next in Thread]