qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 01/17] qidl: add QEMU IDL processor


From: Michael Roth
Subject: Re: [Qemu-devel] [PATCH 01/17] qidl: add QEMU IDL processor
Date: Tue, 5 Jun 2012 16:11:28 -0500
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Jun 05, 2012 at 01:06:10PM +0300, Avi Kivity wrote:
> On 06/05/2012 04:00 AM, Michael Roth wrote:
> > This is an import of Anthony's qidl compiler, with some changes squashed
> > in to add support for doing the visitor generation via QEMU's qapi code
> > generators rather than directly.
> > 
> > Documentation has been imported as well, as is also viewable at:
> > 
> > https://github.com/aliguori/qidl/blob/master/qc.md
> > 
> > This will be used to add annotations to device structs to aid in
> > generating visitors that can be used to serialize/unserialize them.
> > 
> > +
> > +    typedef struct SerialDevice {
> > +        SysBusDevice parent;
> > +    
> > +        uint8_t thr;            // transmit holding register
> > +        uint8_t lsr;            // line status register
> > +        uint8_t ier;            // interrupt enable register
> > +    
> > +        int int_pending;        // whether we have a pending queued 
> > interrupt
> > +        CharDriverState *chr;   // backend
> > +    } SerialDevice;
> > +
> > +Getting Started
> > +---------------
> > +
> > +The first step is to move your device struct definition to a header file.  
> > This
> > +header file should only contain the struct definition and any preprocessor
> > +declarations you need to define the structure.  This header file will act 
> > as
> > +the source for the QC IDL compiler.
> > +
> 
> Is is possible to let the compiler process the .c file, with the IDL
> delimited by some marker?  I like how device models are self contained
> in one file now.

It's possible, but only if we inject the generated visitor code into the
devices via an #include "qapi-generated/<device>-qapi-visit.c";

I'm not sure how acceptable that is... but it does reduce the churn
quite a bit.

> 
> 
> > +Do not include any function declarations in this header file as QC does not
> > +understand function declarations.
> > +
> > +Determining What State Gets Saved
> > +---------------------------------
> > +
> > +By default, QC saves every field in a structure it sees.  This provides 
> > maximum
> > +correctness by default.  However, device structures generally contain state
> > +that reflects state that is in someway duplicated or not guest visible.  
> > This
> > +more often that not reflects design implementation details.
> > +
> > +Since design implementation details change over time, saving this state 
> > makes
> > +compatibility hard to maintain since it would effectively lock down a 
> > device's
> > +implementation.
> > +
> > +QC allows a device author to suppress certain fields from being saved 
> > although
> > +there are very strict rules about when this is allowed and what needs to 
> > be done
> > +to ensure that this does not impact correctness.
> > +
> > +There are three cases where state can be suppressed: when it is 
> > **immutable**,
> > +**derived**, or **broken**.  
> 
> There is a fourth class, non-guest-visible state (below).  There is a
> fifth class, migrated by other means, which includes memory and block
> device state, but of course it isn't interesting in this context.

There's a higher-level annotation, qc_declaration, which denotes what
devices/structs should be processed by the QIDL compiler (and follow
it's rules). So there's an implied "handled by other means" for
everything that falls outside this category.

> 
> In addition, QC can decide at run time whether to
> > +suppress a field by assigning it a **default** value.
> > +
> > +## Immutable Fields
> > +
> > +If a field is only set during device construction, based on parameters 
> > passed to
> > +the device's constructor, then there is no need to send save and restore 
> > this
> > +value.  We call these fields immutable and we tell QC about this fact by 
> > using
> > +a **_immutable** marker.
> > +
> > +In our *SerialDevice* example, the *CharDriverState* pointer reflects the 
> > host
> > +backend that we use to send serial output to the user.  This is only 
> > assigned
> > +during device construction and never changes.  This means we can add an
> > +**_immutable** marker to it:
> 
> Even if it does change (suppose we add a monitor command to retarget a
> the serial device's chardev), it need not be migrated since it doesn't
> describe guest state, only host state.  Maybe we should mark *chr _host
> instead of _immutable.
> 
> > +
> > +    typedef struct SerialDevice {
> > +        SysBusDevice parent;
> > +    
> > +        uint8_t thr;            // transmit holding register
> > +        uint8_t lsr;            // line status register
> > +        uint8_t ier;            // interrupt enable register
> > +    
> > +        int int_pending;        // whether we have a pending queued 
> > interrupt
> > +        CharDriverState _immutable *chr;
> > +    } SerialDevice;
> > +
> > +When reviewing patches that make use of the **_immutable** marker, the 
> > following
> > +guidelines should be followed to determine if the marker is being used
> > +correctly.
> > +
> > + 1. Check to see if the field is assigned anywhere other than the device
> > +    initialization function.
> > +
> > + 2. Check to see if any function is being called that modifies the state 
> > of the
> > +    field outside of the initialization function.
> > +
> > +It can be subtle whether a field is truly immutable.  A good example is a
> > +*QEMUTimer*.  Timer's will usually have their timeout modified with a call 
> > to
> > +*qemu_mod_timer()* even though they are only assigned in the device
> > +initialization function.
> 
> I think this is where _host is useful.  You can have two QEMUTimers, one
> driven by the guest and the other by the host (like, say, the display
> refresh timer).  You would want to migrate one but not the other.
> 
> > +
> > +If the timer is always modified with a fixed value that is not dependent on
> > +guest state, then the timer is immutable since it's unaffected by the 
> > state of
> > +the guest.
> > +
> > +On the other hand, if the timer is modified based on guest state (such as a
> > +guest programmed time out), then the timer carries state.  It may be 
> > necessary
> > +to save/restore the timer or mark it as **_derived** and work with it
> > +accordingly.
> > +
> > +### Derived Fields
> > +
> > +If a field is set based on some other field in the device's structure, 
> > then its
> > +value is derived.  Since this is effectively duplicate state, we can avoid
> > +sending it and then recompute it when we need to.  Derived state requires 
> > a bit
> > +more handling that immutable state.
> > +
> > +In our *SerialDevice* example, our *int_pending* flag is really derived 
> > from
> > +two pieces of state.  It is set based on whether interrupts are enabled in 
> > the
> > +*ier* register and whether there is *THRE* flag is not set in the *lsr*
> > +register.
> 
> Device model authors should be encouraged to avoid derived state in
> simple cases like that, instead use a function.
> 
> > +
> > +To mark a field as derived, use the **_derived** marker.  To update our
> > +example, we would do:
> > +
> > +    typedef struct SerialDevice {
> > +        SysBusDevice parent;
> > +    
> > +        uint8_t thr;            // transmit holding register
> > +        uint8_t lsr;            // line status register
> > +        uint8_t ier;            // interrupt enable register
> > +    
> > +        int _derived int_pending; // whether we have a pending queued 
> > interrupt
> > +        CharDriverState _immutable *chr;
> > +    } SerialDevice;
> > +
> > +There is one other critical step needed when marking a field as derived.  A
> > +*post_load* function must be added that updates this field after loading 
> > the
> > +rest of the device state.  This function is implemented in the device's 
> > source
> > +file, not in the QC header.  Below is an example of what this function may 
> > do:
> > +
> > +    static void serial_post_load(SerialDevice *s)
> > +    {
> > +        s->int_pending = !(s->lsr & THRE) && (s->ier & INTE);
> > +    }
> > +
> > +When reviewing a patch that marks a field as *_derived*, the following 
> > criteria
> > +should be used:
> > +
> > + 1. Does the device have a post load function?
> > +
> > + 2. Does the post load function assign a value to all of the derived 
> > fields?
> > +
> > + 3. Are there any obvious places where a derived field is holding unique 
> > state?
> > +
> 
> <snip>
> 
> Suggestion: add a _guest marker for ordinary state.  Fail the build on
> unmarked fields.  This ensures that some thought is given to each field,
> instead of having a default that may be correct most of the time, but
> not always.

Hmm, I my general thought was that is doesn't hurt to send extra, which
made serialization a reasonable default course of action.

But there is indeed a risk of overwriting target state with garbage if
we don't verify what fields really should/shouldn't be sent. A marker to
track this does seem useful in that regard...

> 
> Suggestion: add a mandatory position hint (_guest(7) or _pos(7)) that
> generates ordering within the fields.  This decouples the order of lines
> in the struct from the protocol, so you can add state where it make
> sense, or rearrange lines when they don't, and detect copy/paste errors.
> 

I'm in agreement with Gerd that the wire protocol we use should support
field names. I think device state constitutes a small enough percentage
of total migrated state that the performance impact would be negligable,
and migration will invariably add some negotiation/compatibility
functionality on top of the serialization that would make having field
names intact useful for analyzing/debugging things.

I personally like the idea of using compressed json, but I think we
could implement a QObject<->BER mechanism that would provide this as
well.

> 
> > diff --git a/scripts/qc.py b/scripts/qc.py
> > new file mode 100755
> > index 0000000..74f2a40
> > --- /dev/null
> > +++ b/scripts/qc.py
> > @@ -0,0 +1,494 @@
> > +#!/usr/bin/python
> > +
> > +import sys
> > +from ordereddict import OrderedDict
> > +
> > +marker = "qc_declaration"
> > +marked = False
> > +
> > +class Input(object):
> > +    def __init__(self, fp):
> > +        self.fp = fp
> > +        self.buf = ''
> > +        self.eof = False
> > +
> > +    def pop(self):
> > +        if len(self.buf) == 0:
> > +            if self.eof:
> > +                return ''
> > +
> > +            data = self.fp.read(1024)
> > +            if data == '':
> > +                self.eof = True
> > +                return ''
> > +
> > +            self.buf += data
> > +
> > +        ch = self.buf[0]
> > +        self.buf = self.buf[1:]
> > +        return ch
> 
> 
> Could be simplified as fp.read(1).  Does the performance really suffer
> if you read a byte at a time?

Currently, for 100 annotated header files identical to
mc146818rtc_state.h, it's about 4 seconds for me, reading from ssd, writing to
stdout. With the change it's roughly 4.2s. So yah, seems like a nice
simplification.

> 
> > +
> > +def in_range(ch, start, end):
> > +    if ch >= start and ch <= end:
> > +        return True
> > +    return False
> > +
> > +# D                        [0-9]
> > +# L                        [a-zA-Z_]
> > +# H                        [a-fA-F0-9]
> > +# E                        [Ee][+-]?{D}+
> > +# FS                       (f|F|l|L)
> > +# IS                       (u|U|l|L)*
> > +
> > +def is_D(ch):
> > +    return in_range(ch, '0', '9')
> > +
> > +def is_L(ch):
> > +    return in_range(ch, 'a', 'z') or in_range(ch, 'A', 'Z') or ch == '_'
> > +
> > +def is_H(ch):
> > +    return in_range(ch, 'a', 'f') or in_range(ch, 'A', 'F') or is_D(ch)
> > +
> > +def is_FS(ch):
> > +    return ch in 'fFlL'
> > +
> > +def is_IS(ch):
> > +    return ch in 'uUlL'
> 
> 
> import re
> 
> D = re.compile(r'[0-9]')
> ...
> IS = re.compile(r'[uUlL]+')
> 
> <snip parser>
> 
> Surely there are available lexer/parser packages?

This seems promising:

http://pygments.org/docs/lexerdevelopment/

> 
> 
> -- 
> error compiling committee.c: too many arguments to function
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]