qemu-devel

Re: [Qemu-devel] RFC: libyajl for JSON


From: Paolo Bonzini
Subject: Re: [Qemu-devel] RFC: libyajl for JSON
Date: Mon, 2 Nov 2015 14:47:35 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0


On 02/11/2015 13:56, Markus Armbruster wrote:
> A classical parser gets passed a source of characters (string, file
> descriptor, whatever), parses until it reaches a terminating state, then
> returns an abstract syntax tree (AST).  Basically a function mapping a
> character source to an AST.
> 
> Our JSON parser has inverted control flow: it gets fed characters one at
> a time, by the main loop via chardev IOReadHandler, and when it reaches
> a terminating state, it passes the parse result to a callback.

Yes, ours is a so-called "push" parser or streaming parser, and it's
actually more and more popular because it's easy to go push->pull but
much harder to build a push parser on top of a pull engine.

yajl does the same as QEMU already does.  Its main API is:

   YAJL_API yajl_status yajl_parse(yajl_handle hand,
                                   const unsigned char * jsonText,
                                   size_t jsonTextLength);

   YAJL_API yajl_status yajl_complete_parse(yajl_handle hand);

and actually these days you can even ask Bison to produce a push parser.

An interesting part of yajl is that it doesn't have an AST.  Instead the
user provides callbacks for integers, booleans, "start of map", "end of
map", etc.  So the "emit QObject" part of our parser, and especially the
need to maintain a stack, would remain.  Though perhaps we could use our
own callback abstraction (visitors) and reuse it to build QObjects with
only a simple mapping layer.  In that case the lack of AST would
actually be an advantage, or at least neutral.
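
To illustrate why the stack remains when the parser only delivers
events, here is a minimal sketch in C.  Everything in it is an
assumption made for illustration: the event_callbacks struct is a
hypothetical interface in the callback style described above (it is
neither yajl's real callback table nor QEMU's visitor API), and the
"object" being built is just a per-container child count rather than a
QObject.  The events are replayed by hand for the input [1, [2, 3], 4].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event interface: one callback per kind of event.
 * A callback returns nonzero to continue and 0 to abort. */
typedef struct {
    int (*on_integer)(void *ctx, long long value);
    int (*on_start_array)(void *ctx);
    int (*on_end_array)(void *ctx);
} event_callbacks;

#define MAX_DEPTH 32

typedef struct {
    int children;               /* values seen directly in this container */
} frame;

typedef struct {
    frame stack[MAX_DEPTH];     /* containers that are still open */
    int depth;                  /* number of open containers */
    int last_closed_children;   /* size of the most recently closed one */
} builder;

/* Attach one more value to the innermost open container, if any. */
static void add_child(builder *b)
{
    if (b->depth > 0) {
        b->stack[b->depth - 1].children++;
    }
}

static int on_integer(void *ctx, long long value)
{
    builder *b = ctx;
    (void)value;                /* a real builder would store the value */
    add_child(b);
    return 1;
}

static int on_start_array(void *ctx)
{
    builder *b = ctx;
    if (b->depth == MAX_DEPTH) {
        return 0;               /* nested too deeply: abort */
    }
    add_child(b);               /* the new array is a child of its parent */
    b->stack[b->depth++].children = 0;
    return 1;
}

static int on_end_array(void *ctx)
{
    builder *b = ctx;
    if (b->depth == 0) {
        return 0;               /* unbalanced end event: abort */
    }
    /* Pop: the parent's frame becomes current again. */
    b->last_closed_children = b->stack[--b->depth].children;
    return 1;
}

/* Replay the events for [1, [2, 3], 4]; return the outer array's size,
 * or -1 if any callback aborted. */
static int replay_nested_array(void)
{
    builder b = { .depth = 0 };
    const event_callbacks cb = { on_integer, on_start_array, on_end_array };
    int ok = cb.on_start_array(&b)
          && cb.on_integer(&b, 1)
          && cb.on_start_array(&b)
          && cb.on_integer(&b, 2)
          && cb.on_integer(&b, 3)
          && cb.on_end_array(&b)    /* inner array closes with 2 children */
          && cb.on_integer(&b, 4)
          && cb.on_end_array(&b);   /* outer array closes with 3 children */
    assert(!ok || b.depth == 0);
    return ok ? b.last_closed_children : -1;
}
```

Without the saved frame for the outer array, the builder could not go
back to counting its children once the inner array closes; that is the
bookkeeping an AST-less event API leaves to the user.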

> Except this is an oversimplification.  We actually have two parsers, a
> stupid one that can only count braces and brackets (json-streamer.c),
> and the real one (json-parser.c).  The stupid one passes a list of
> tokens to the callback, which runs the real parser to parse the tokens
> into a QObject (this is our AST), then does whatever needs to be done
> with the QObject.
> 
> But this is detail, the point remains that the current JSON parsing
> machinery gets fed characters and the code consuming the parse lives in
> a callback, and any replacement also needs to be fit into the main loop
> somehow.

Yeah, the streamer vs. parser distinction is a hack that avoids writing
everything in continuation-passing style (where the continuation is
basically the parser's state).  It's weird, but luckily it's not visible
to the outside world.
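
The streamer half of that split can be sketched roughly as follows.
This is not the code in json-streamer.c: all names, the fixed-size
buffer and the resync-on-overflow behaviour are assumptions made for
illustration, and the sketch buffers raw bytes where the real streamer
passes a list of tokens.  It only shows the shape: bytes are pushed in
one at a time, the only state is brace/bracket depth plus string
tracking, and the consumer runs from a callback once depth returns to
zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: receives one complete top-level value. */
typedef void (*message_cb)(const char *msg, size_t len, void *opaque);

typedef struct {
    char buf[256];      /* bytes of the value currently being collected */
    size_t len;
    int depth;          /* count of unclosed '{' and '[' */
    int in_string;      /* inside a "..." literal */
    int escaped;        /* previous byte was a backslash in a string */
    message_cb emit;
    void *opaque;
} streamer;

static void streamer_init(streamer *s, message_cb emit, void *opaque)
{
    memset(s, 0, sizeof(*s));
    s->emit = emit;
    s->opaque = opaque;
}

/* Push one byte; call emit() when a top-level object or array closes. */
static void streamer_feed(streamer *s, char c)
{
    if (s->depth == 0 && c != '{' && c != '[') {
        return;                         /* skip bytes between values */
    }
    if (s->len == sizeof(s->buf)) {     /* overlong input: drop, resync */
        s->len = 0;
        s->depth = 0;
        s->in_string = 0;
        s->escaped = 0;
        return;
    }
    assert(s->len < sizeof(s->buf));
    s->buf[s->len++] = c;

    if (s->in_string) {                 /* braces in strings do not count */
        if (s->escaped) {
            s->escaped = 0;
        } else if (c == '\\') {
            s->escaped = 1;
        } else if (c == '"') {
            s->in_string = 0;
        }
        return;
    }
    if (c == '"') {
        s->in_string = 1;
    } else if (c == '{' || c == '[') {
        s->depth++;
    } else if (c == '}' || c == ']') {
        if (--s->depth == 0) {          /* terminating state reached */
            s->emit(s->buf, s->len, s->opaque);
            s->len = 0;
        }
    }
}

/* Example consumer: counts the messages it is handed. */
typedef struct {
    int count;
    size_t last_len;
} stats;

static void count_message(const char *msg, size_t len, void *opaque)
{
    stats *st = opaque;
    (void)msg;          /* a real consumer would run the full parser here */
    st->count++;
    st->last_len = len;
}

/* Feed a whole string one byte at a time, as an I/O handler would. */
static int count_messages(const char *text)
{
    stats st = { 0, 0 };
    streamer s;
    streamer_init(&s, count_message, &st);
    for (const char *p = text; *p; p++) {
        streamer_feed(&s, *p);
    }
    return st.count;
}
```

Feeding it two concatenated objects invokes the callback twice, while a
truncated object produces no callback at all; the state simply waits in
the struct for the next byte, which is what lets it sit behind a main
loop read handler.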

Paolo


