qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] RFC: libyajl for JSON


From: Eric Blake
Subject: Re: [Qemu-devel] RFC: libyajl for JSON
Date: Mon, 2 Nov 2015 08:46:58 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 11/02/2015 01:40 AM, Markus Armbruster wrote:
> Eric Blake <address@hidden> writes:
> 
>> Loaded question in response to
>> https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg06988.html, but
> 
> Discussion of our parser's enormous appetite for wasting RAM.  Fixable,
> but it's work, and it's not its only defect.
> 

>> On the other hand, one of the "features" of qemu's hand-rolled json
>> parser is the ability to do qobject_from_jsonf("{'foo':%s}", "bar")
>> (that is, we extended JSON with our notion of single-quoted strings, and
>> with our notion of %s and similar escape sequences for piecing together
>> multiple inputs into a single input stream without having to first
>> g_strdup_printf our pieces into a single string).  I don't know if
>> libyajl lets us add extensions to the language it parses.
> 
> Actually two separate extensions:
> 
> * Single-quoted strings
> 
>   The extension's purpose is avoiding quotes in C.  Example:
> 
>       "{ 'execute': 'migrate_set_speed', 'arguments': { 'value': 10 } }"
> 
>   is easier to read than
> 
>       "{ \"execute\": \"migrate_set_speed\", \"arguments\": { \"value\": 10 } 
> }"
> 
>   Most actual uses are in tests.

Except that we have documented that QMP clients may use it; and indeed:

$ printf "{'execute':'qmp_capabilities'}\n{'execute':'quit'}" | \
   ./x86_64-softmmu/qemu-system-x86_64 -nographic -nodefaults -qmp stdio
{"QMP": {"version": {"qemu": {"micro": 50, "minor": 4, "major": 2},
"package": ""}, "capabilities": []}}
{"return": {}}
{"return": {}}
{"timestamp": {"seconds": 1446478389, "microseconds": 265886}, "event":
"SHUTDOWN"}

So that is an absolute must for whatever parser we choose.  My first
glance at libyajl is that it does NOT currently allow single quotes (not
even with any of its existing options), so we'd have to pre-process
input before feeding it to yajl.

> 
>   JSON5, a proposed extension to JSON also supports single-quoted
>   strings.  So does Python.

It would be an interesting task to see if yajl would accept patches to
parse JSON5 as additional options (for example, yajl already has options
on whether to diagnose or ignore UTF8 errors, and on whether to allow /*
*/ javascript comments).  But then we'd be requiring an
as-yet-unreleased version of libyajl rather than being able to rely on
what distros already have, at least for a few years; or we'd have to
carry new-enough yajl as a git submodule within qemu.git.

I'll ask on the yajl mailing lists, to get a feel for community
receptiveness, before attempting to actually write patches on that front.

> 
> * Optional interpolation
> 
>   If you pass a va_list to the parser, it replaces certain escape
>   sequences by values from that va_list.  The escape sequences are a
>   subset of printf conversion specifiers, to enable compile-time
>   checking with __attribute__((format...)).
> 
>   We used to rely on this for creating "rich" error objects, but those
>   are long gone (commit df1e608).  Perhaps two dozen uses are left.

The testsuite is probably the biggest user, but we still have uses
throughout the code base.

Based on my look at libyajl, I think we CAN get away with stream
interpolation - yajl maintains state such that you do NOT have to feed
it the entire stream in one go.  So this one is not insurmountable; we
could rewrite our qobject_from_jsonf() to make multiple calls into yajl
without having to pre-format a single string.

> 
>   We could convert them to use "normal" interpolation, i.e. first format
>   the arguments as JSON, then interpolate with g_strdup_printf() or
>   similar, then parse.  Formatting only to parse it right back is
>   inelegant.  Also inefficient, but that probably doesn't matter here.

Or indeed, this is still an option.

> 
>   The current interface could be retained as convenience function to
>   interpolate and feed the result to the JSOn parser.
> 
>   Alternatively, we could build the QObject manually instead.  More
>   efficient than either kind of interpolation, but a good deal less
>   readable.

At any rate, it is certainly less of a show-stopper when compared to the
need for supporting single-quoted strings.

> 
> Got one more, actually a pitfall, not an extension:
> 
> * Representation of JSON numbers
> 
>   RFC 7159 doesn't specify how numbers are to be represented.
> 
>   Many JSON implementations represent any JSON number as double.  This
>   can represent signed integers up to 53 bits exactly.
> 
>   Our parser represents numbers as int64_t when possible, else as
>   double.
> 
>   Unlike the extensions above, this one is essential: any parser that
>   can't do it is useless for us.  Can yajl do it?

Yes; the yajl parser has 4 modes of parse operation based on which of
three callbacks you implement: double-only (yajl_double), long long-only
(yajl_integer), double-and-int (both yajl_double and yajl_integer, not
sure which one has precedence if input satisfies both formats), or
custom number (yajl_number, which is given a string, and you turn it
into the format of your choice).  Likewise, the yajl formatter has 2
functions: yajl_gen_double() (formats doubles) and yajl_gen_number()
(you provide literal strings formatted how you like).

> 
> More extensions or pitfalls might be lurking in our parser.  Therefore,
> replacing our parser by a suitable library is not without risk.  I guess
> we could do it over a full development cycle.  No way for 2.5.
> 

Absolutely not for 2.5; we've missed softfreeze, and this would count as
a new feature.

> If we decide to attempt replacing it in the next development cycle, we
> might want to refrain from fixing its bugs except for true show
> stoppers.
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]