[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] RFC: libyajl for JSON
From: |
Markus Armbruster |
Subject: |
Re: [Qemu-devel] RFC: libyajl for JSON |
Date: |
Mon, 02 Nov 2015 20:10:33 +0100 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) |
Eric Blake <address@hidden> writes:
> On 11/02/2015 01:40 AM, Markus Armbruster wrote:
>> Eric Blake <address@hidden> writes:
>>
>>> Loaded question in response to
>>> https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg06988.html, but
>>
>> Discussion of our parser's enormous appetite for wasting RAM. Fixable,
>> but it's work, and it's not its only defect.
>>
>
>>> On the other hand, one of the "features" of qemu's hand-rolled json
>>> parser is the ability to do qobject_from_jsonf("{'foo':%s}", "bar")
>>> (that is, we extended JSON with our notion of single-quoted strings, and
>>> with our notion of %s and similar escape sequences for piecing together
>>> multiple inputs into a single input stream without having to first
>>> g_strdup_printf our pieces into a single string). I don't know if
>>> libyajl lets us add extensions to the language it parses.
>>
>> Actually two separate extensions:
>>
>> * Single-quoted strings
>>
>> The extension's purpose is avoiding quotes in C. Example:
>>
>> "{ 'execute': 'migrate_set_speed', 'arguments': { 'value': 10 } }"
>>
>> is easier to read than
>>
>> "{ \"execute\": \"migrate_set_speed\", \"arguments\": { \"value\": 10
>> } }"
>>
>> Most actual uses are in tests.
>
> Except that we have documented that QMP clients may use it; and indeed:
>
> $ printf "{'execute':'qmp_capabilities'}\n{'execute':'quit'}" | \
> ./x86_64-softmmu/qemu-system-x86_64 -nographic -nodefaults -qmp stdio
> {"QMP": {"version": {"qemu": {"micro": 50, "minor": 4, "major": 2},
> "package": ""}, "capabilities": []}}
> {"return": {}}
> {"return": {}}
> {"timestamp": {"seconds": 1446478389, "microseconds": 265886}, "event":
> "SHUTDOWN"}
Unnecessary mistake :(
> So that is an absolute must for whatever parser we choose. My first
> glance at libyajl is that it does NOT currently allow single quotes (not
> even with any of its existing options), so we'd have to pre-process
> input before feeding it to yajl.
I'm not sure reusing an existing parser makes much sense if we need to
preprocess anyway. After all, parsing JSON isn't exactly a moon shot,
at least if you know what you're doing.
>> JSON5, a proposed extension to JSON also supports single-quoted
>> strings. So does Python.
>
> It would be an interesting task to see if yajl would accept patches to
> parse JSON5 as additional options (for example, yajl already has options
> on whether to diagnose or ignore UTF8 errors, and on whether to allow /*
> */ javascript comments). But then we'd be requiring an
> as-yet-unreleased version of libyajl rather than being able to rely on
> what distros already have, at least for a few years; or we'd have to
> carry new-enough yajl as a git submodule within qemu.git.
>
> I'll ask on the yajl mailing lists, to get a feel for community
> receptiveness, before attempting to actually write patches on that front.
Makes sense.
>> * Optional interpolation
>>
>> If you pass a va_list to the parser, it replaces certain escape
>> sequences by values from that va_list. The escape sequences are a
>> subset of printf conversion specifiers, to enable compile-time
>> checking with __attribute__((format...)).
>>
>> We used to rely on this for creating "rich" error objects, but those
>> are long gone (commit df1e608). Perhaps two dozen uses are left.
>
> The testsuite is probably the biggest user, but we still have uses
> throughout the code base.
>
> Based on my look at libyajl, I think we CAN get away with stream
> interpolation - yajl maintains state such that you do NOT have to feed
> it the entire stream in one go. So this one is not insurmountable; we
> could rewrite our qobject_from_jsonf() to make multiple calls into yajl
> without having to pre-format a single string.
Sounds complicated...
>> We could convert them to use "normal" interpolation, i.e. first format
>> the arguments as JSON, then interpolate with g_strdup_printf() or
>> similar, then parse. Formatting only to parse it right back is
>> inelegant. Also inefficient, but that probably doesn't matter here.
>
> Or indeed, this is still an option.
>
>>
>> The current interface could be retained as convenience function to
>> interpolate and feed the result to the JSOn parser.
>>
>> Alternatively, we could build the QObject manually instead. More
>> efficient than either kind of interpolation, but a good deal less
>> readable.
>
> At any rate, it is certainly less of a show-stopper when compared to the
> need for supporting single-quoted strings.
>
>> Got one more, actually a pitfall, not an extension:
>>
>> * Representation of JSON numbers
>>
>> RFC 7159 doesn't specify how numbers are to be represented.
>>
>> Many JSON implementations represent any JSON number as double. This
>> can represent signed integers up to 53 bits exactly.
>>
>> Our parser represents numbers as int64_t when possible, else as
>> double.
>>
>> Unlike the extensions above, this one is essential: any parser that
>> can't do it is useless for us. Can yajl do it?
>
> Yes; the yajl parser has 4 modes of parse operation based on which of
> three callbacks you implement: double-only (yajl_double), long long-only
> (yajl_integer), double-and-int (both yajl_double and yajl_integer, not
> sure which one has precedence if input satisfies both formats), or
> custom number (yajl_number, which is given a string, and you turn it
> into the format of your choice). Likewise, the yajl formatter has 2
> functions: yajl_gen_double() (formats doubles) and yajl_gen_number()
> (you provide literal strings formatted how you like).
Good.
>> More extensions or pitfalls might be lurking in our parser. Therefore,
>> replacing our parser by a suitable library is not without risk. I guess
>> we could do it over a full development cycle. No way for 2.5.
>>
>
> Absolutely not for 2.5; we've missed softfreeze, and this would count as
> a new feature.
>
>> If we decide to attempt replacing it in the next development cycle, we
>> might want to refrain from fixing its bugs except for true show
>> stoppers.
>>
- Re: [Qemu-devel] RFC: libyajl for JSON, Markus Armbruster, 2015/11/02
- Re: [Qemu-devel] RFC: libyajl for JSON, Paolo Bonzini, 2015/11/02
- Re: [Qemu-devel] RFC: libyajl for JSON, Stefan Hajnoczi, 2015/11/02
- Re: [Qemu-devel] RFC: libyajl for JSON, Eric Blake, 2015/11/02
- Re: [Qemu-devel] RFC: libyajl for JSON,
Markus Armbruster <=
- Re: [Qemu-devel] RFC: libyajl for JSON, Eric Blake, 2015/11/02
- Re: [Qemu-devel] RFC: libyajl for JSON, Markus Armbruster, 2015/11/03
- Re: [Qemu-devel] RFC: libyajl for JSON, Luiz Capitulino, 2015/11/03
- Re: [Qemu-devel] RFC: libyajl for JSON, Paolo Bonzini, 2015/11/03
- Re: [Qemu-devel] RFC: libyajl for JSON, Luiz Capitulino, 2015/11/03
- Re: [Qemu-devel] RFC: libyajl for JSON, Paolo Bonzini, 2015/11/03
- Re: [Qemu-devel] RFC: libyajl for JSON, Luiz Capitulino, 2015/11/03
- Re: [Qemu-devel] RFC: libyajl for JSON, Paolo Bonzini, 2015/11/03
- Re: [Qemu-devel] RFC: libyajl for JSON, Markus Armbruster, 2015/11/03
- Re: [Qemu-devel] RFC: libyajl for JSON, Eric Blake, 2015/11/03