qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH RFC v2 21/47] qapi: New QAPISchema intermediate


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH RFC v2 21/47] qapi: New QAPISchema intermediate reperesentation
Date: Mon, 27 Jul 2015 11:23:19 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Eric Blake <address@hidden> writes:

> On 07/01/2015 02:22 PM, Markus Armbruster wrote:
>> The QAPI code generators work with a syntax tree (nested dictionaries)
>> plus a few symbol tables (also dictionaties) on the side.
>
> s/dictionaties/dictionaries/

Will fix.

>> They have clearly outgrown these simple data structures.  There's lots
>> of rummaging around in dictionaries, and information is recomputed on
>> the fly.  For the work I'm going to do, I want more clearly defined
>> and more convenient interfaces.
>> 
>> Going forward, I also want less coupling between the back-ends and the
>> syntax tree, to make messing with the syntax easier.
>
> Indeed - should be a lot easier to add new qapi .json syntax sugar on
> the front-end, or to tweak generated C layout on the backend, without
> close coupling between the two.  Particularly when adding front-end
> sugar extensions that still normalize into the same backend structs.
>
>> 
>> Create a bunch of classes to represent QAPI schemata.
>> 
>> Have the QAPISchema initializer call the parser, then walk the syntax
>> tree to create the new internal representation, and finally perform
>> semantic analysis.
>> 
>> Shortcut: the semantic analysis still relies on existing check_exprs()
>> to do the actual semantic checking.  All this code needs to move into
>> the classes.  Mark as TODO.
>> 
>> We generate array types eagerly, even though most of them aren't used.
>> Mark as TODO.
>> 
>
> No change to generated files at this stage in the game (this is mostly
> additive, then later patches shed their old ways of doing things by
> using this).  Good division of labor between patches.

I'll steal from this paragraph for my commit message.

>> Signed-off-by: Markus Armbruster <address@hidden>
>> ---
>>  scripts/qapi-commands.py       |   2 +-
>>  scripts/qapi-event.py          |   2 +-
>>  scripts/qapi-types.py          |   2 +-
>>  scripts/qapi-visit.py          |   2 +-
>>  scripts/qapi.py                | 351 
>> +++++++++++++++++++++++++++++++++++++++--
>>  tests/qapi-schema/test-qapi.py |   2 +-
>>  6 files changed, 347 insertions(+), 14 deletions(-)
>> 
>
>> +++ b/scripts/qapi.py
>
>> +
>> +#
>> +# Schema compiler frontend
>> +#
>> +
>> +class QAPISchemaEntity(object):
>> +    def __init__(self, name, info):
>> +        assert isinstance(name, str)
>> +        self.name = name
>> +        self.info = info
>> +    def check(self, schema):
>> +        pass
>> +
>> +class QAPISchemaType(QAPISchemaEntity):
>> +    pass
>> +
>> +class QAPISchemaBuiltinType(QAPISchemaType):
>> +    def __init__(self, name):
>> +        QAPISchemaType.__init__(self, name, None)
>> +
>> +class QAPISchemaEnumType(QAPISchemaType):
>> +    def __init__(self, name, info, values):
>> +        QAPISchemaType.__init__(self, name, info)
>> +        for v in values:
>> +            assert isinstance(v, str)
>> +        self.values = values
>
> worth a check() method to ensure values don't collide?  Especially since
> you already do that in QAPISchemaObjectTypeMember.check()

Can do.

Aside: everything the check() methods assert must be established by
semantic analysis.  Moving semantic analysis into the classes as
outlined in the commit message will hopefully make that obvious enough
to permit dropping the assertions.

>> +
>> +class QAPISchemaArrayType(QAPISchemaType):
>> +    def __init__(self, name, info, element_type):
>> +        QAPISchemaType.__init__(self, name, info)
>> +        assert isinstance(element_type, str)
>> +        self.element_type_name = element_type
>> +        self.element_type = None
>> +    def check(self, schema):
>> +        self.element_type = schema.lookup_type(self.element_type_name)
>> +        assert self.element_type
>> +
>> +class QAPISchemaObjectType(QAPISchemaType):
>> +    def __init__(self, name, info, base, local_members, variants):
>> +        QAPISchemaType.__init__(self, name, info)
>> +        assert base == None or isinstance(base, str)
>> +        for m in local_members:
>> +            assert isinstance(m, QAPISchemaObjectTypeMember)
>> +        if variants != None:
>> +            assert isinstance(variants, QAPISchemaObjectTypeVariants)
>> +        self.base_name = base
>> +        self.base = None
>> +        self.local_members = local_members
>> +        self.variants = variants
>> +        self.members = None
>> +    def check(self, schema):
>> +        if self.members:
>> +            return                      # already checked
>> +        assert self.members == None     # not running in cycles
>> +        self.members = False            # mark as being checked
>
> Interesting, but makes sense (since we allow self-referential nesting,
> populating our own members may require revisiting the same type).

Yes.  Another way to put it: for a recursive type like this, we better
check the recursion terminates.

>                                                                    Cute
> that python allows both None and False as distinct non-true settings, so
> that you end up using the variable as a tri-state.

Yup :)

>> +        if self.base_name:
>> +            self.base = schema.lookup_type(self.base_name)
>> +            self.base.check(schema)
>
> Do you need 'assert self.base' here, similar to how you did it in
> QAPISchemaArrayType.check()?  I guess you'll still get a python
> exception that None.check() doesn't exist if the lookup_type failed, so
> it just depends on what error message you want.

I don't see a hard need, but consistency is nice.  Since I generally
assert elsewhere, I'll add an assertion here.

>> +            members = list(self.base.members)
>
> For that matter, do you want to assert isinstance(self.base,
> QAPISchemaObjectType), since lookup_type can return other types, but
> other child classes of QAPISchemaType do not have a .members?

Yes, that's the appropriate assertion.

>> +        else:
>> +            members = []
>> +        seen = {}
>> +        for m in members:
>> +            seen[m.name] = m
>> +        for m in self.local_members:
>> +            m.check(schema, members, seen)
>> +        if self.variants:
>> +            self.variants.check(schema, members, seen)
>> +        self.members = members
>> +
>> +class QAPISchemaObjectTypeMember(object):
>
> Is 'object' the right base class, or should it be QAPISchemaEntity?

No, it's really object.  A schema's entities can be object types, other
types, commands and events, but not members.  A schema doesn't have
"members", only an object type has.

>> +    def __init__(self, name, typ, optional):
>> +        assert isinstance(name, str)
>> +        assert isinstance(typ, str)
>
> I guess that means all callers flatten array types into an unambiguous
> string first.

Yes.

Conscious design decision: every type in the internal representation has
a unique name.  If the concrete syntax lets you define an anonymous
type, we make up a name in the internal representation.  One less
special case to worry about.

>> +        assert isinstance(optional, bool)
>> +        self.name = name
>> +        self.type_name = typ
>> +        self.type = None
>> +        self.optional = optional
>> +    def check(self, schema, all_members, seen):
>> +        assert self.name not in seen
>> +        self.type = schema.lookup_type(self.type_name)
>> +        assert self.type
>> +        all_members.append(self)
>> +        seen[self.name] = self
>> +
>> +class QAPISchemaObjectTypeVariants(object):
>
> Is 'object' the right base class, or should it be QAPISchemaEntity?

As for QAPISchemaObjectTypeMember above, object is appropriate.

>> +    def __init__(self, tag_name, tag_enum, variants):
>> +        assert tag_name == None or isinstance(tag_name, str)
>> +        assert tag_enum == None or isinstance(tag_enum, str)
>> +        for v in variants:
>> +            assert isinstance(v, QAPISchemaObjectTypeVariant)
>> +        self.tag_name = tag_name
>> +        if tag_name:
>> +            assert not tag_enum
>> +            self.tag_member = None
>> +        else:
>> +            self.tag_member = QAPISchemaObjectTypeMember('kind', tag_enum,
>> +                                                         False)
>
> Nice - the tag is either part of the base struct (flat union) or
> implicitly created (plain union).  By the time we are done with check(),
> we then have both its type (possibly generated implicitly) and its name.

... and we don't have to know where they came from anymore.

>> +        self.variants = variants
>> +    def check(self, schema, members, seen):
>> +        if self.tag_name:
>> +            self.tag_member = seen[self.tag_name]
>> +        else:
>> +            self.tag_member.check(schema, members, seen)
>
> A little bit tricky here in that flat unions (tag_name was set) must
> find the tag from the members already loaded from the base class, while
> plain unions (tag_name is None) add their own implicit member into the
> union.  But it works.
>
>> +        assert isinstance(self.tag_member.type, QAPISchemaEnumType)
>> +        for v in self.variants:
>> +            vseen = dict(seen)
>> +            v.check(schema, self.tag_member.type, vseen)
>> +
>> +class QAPISchemaObjectTypeVariant(QAPISchemaObjectTypeMember):
>> +    def __init__(self, name, typ, flat):
>> +        QAPISchemaObjectTypeMember.__init__(self, name, typ, False)
>> +        assert isinstance(flat, bool)
>> +        self.flat = flat
>> +    def check(self, schema, tag_type, seen):
>> +        QAPISchemaObjectTypeMember.check(self, schema, [], seen)
>> +        assert self.name in tag_type.values
>> +        if self.flat:
>> +            self.type.check(schema)
>> +            assert isinstance(self.type, QAPISchemaObjectType)
>
> Do we really want to be tracking self.flat for each variant?  After all,
> docs/qapi-code-gen.txt already describes the mapping from simple union
> into flat union (as if the flat union had a base class with single
> member 'kind' of the right type, then each branch of the union composed
> of an implicit object with a lone member 'data' of the correct type).
> In other words, is it any better to just normalize into that form now,
> such that each QAPISchemaObjectTypeVariant is merely a (often
> one-element) list of name:type members being added to the overall
> QAPISchemaObject?

I tried to do exactly that, but got bogged down in special cases and
copped out.  Then I went on vacation, and now I don't remember the exact
problems anymore %-}

I guess / hope it's just relatively pointless differences in the
generated C code I didn't want to get rid of at this time.  The series
is long and hairy enough as it is...

>                    But I guess it remains to be seen how you use
> self.flat before knowing if it is worth normalizing away from it.

At least introspect.json is oblivious of it.

>> +
>> +class QAPISchemaAlternateType(QAPISchemaType):
>> +    def __init__(self, name, info, variants):
>> +        QAPISchemaType.__init__(self, name, info)
>> +        assert isinstance(variants, QAPISchemaObjectTypeVariants)
>> +        assert not variants.tag_name
>> +        for v in variants.variants:
>> +            assert not v.flat
>
> Hmm. You are using .flat after all.
>
>> +        self.variants = variants
>> +    def check(self, schema):
>> +        self.variants.check(schema, [], {})
>> +
>> +class QAPISchemaCommand(QAPISchemaEntity):
>> +    def __init__(self, name, info, args, rets, gen, success_response):
>> +        QAPISchemaEntity.__init__(self, name, info)
>> +        assert not args or isinstance(args, str)
>> +        assert not rets or isinstance(rets, str)
>
> Again, this means that the caller has already had to convert qapi
> dictionaries into an implicit type and pass in that type name here.
> Works for me.
>
>> +        self.args_name = args
>> +        self.args = None
>> +        self.rets_name = rets
>> +        self.rets = None
>> +        self.gen = gen
>> +        self.success_response = success_response
>> +    def check(self, schema):
>> +        if self.args_name:
>> +            self.args = schema.lookup_type(self.args_name)
>> +            assert isinstance(self.args, QAPISchemaObjectType)
>> +            assert not self.args.variants # not implemented
>> +            self.args.check(schema)
>> +        if self.rets_name:
>> +            self.rets = schema.lookup_type(self.rets_name)
>> +            assert isinstance(self.rets, QAPISchemaType)
>> +            self.rets.check(schema)
>
> Matches our existing difference in requiring a dictionary for arguments,
> but allowing the return of any type (although we now have a whitelist to
> prevent too many additions of non-dictionaries).

Yes.  In my opinion, the difference between arguments and returns is a
design blemish (a function maps from domain to codomain, and both are
just sets, so any differences are likely pointless baggage), but we need
to work with what we've got.

>> +
>> +class QAPISchemaEvent(QAPISchemaEntity):
>> +    def __init__(self, name, info, data):
>> +        QAPISchemaEntity.__init__(self, name, info)
>> +        assert not data or isinstance(data, str)
>> +        self.data_name = data
>> +        self.data = None
>> +    def check(self, schema):
>> +        if self.data_name:
>> +            self.data = schema.lookup_type(self.data_name)
>> +            assert isinstance(self.data, QAPISchemaObjectType)
>> +            self.data.check(schema)
>> +
>> +class QAPISchema(object):
>> +    def __init__(self, fname):
>> +        try:
>> + self.exprs = check_exprs(QAPISchemaParser(open(fname, "r")).exprs)
>> +        except (QAPISchemaError, QAPIExprError), err:
>> +            print >>sys.stderr, err
>> +            exit(1)
>> +        self.entity_dict = {}
>> +        self._def_predefineds()
>> +        self._def_exprs()
>> +        self.check()
>> +
>> +    def get_exprs(self):
>> +        return [expr_elem['expr'] for expr_elem in self.exprs]
>> +
>> +    def _def_entity(self, ent):
>> +        assert ent.name not in self.entity_dict
>> +        self.entity_dict[ent.name] = ent
>> +
>> +    def lookup_entity(self, name, typ=None):
>> +        ent = self.entity_dict.get(name)
>> +        if typ and not isinstance(ent, typ):
>> +            return None
>> +        return ent
>
> Ah, so you enforce a shared namespace between types, commands, and
> events (all three are in self.entity_dict), but can use the typ
> parameter to allow limiting a lookup to just types.

Yes.  It's a convenience feature.

>> +
>> +    def lookup_type(self, name):
>> +        return self.lookup_entity(name, QAPISchemaType)
>> +
>> +    def _def_builtin_type(self, name):
>> +        self._def_entity(QAPISchemaBuiltinType(name))
>> +        if name != '**':
>> +            self._make_array_type(name) # TODO really needed?
>> +
>> +    def _def_predefineds(self):
>> +        for t in ['str', 'number', 'int', 'int8', 'int16', 'int32', 'int64',
>> + 'uint8', 'uint16', 'uint32', 'uint64', 'size', 'bool', '**']:
>> +            self._def_builtin_type(t)
>> +
>> +    def _make_implicit_enum_type(self, name, values):
>> +        name = name + 'Kind'
>> +        self._def_entity(QAPISchemaEnumType(name, None, values))
>> +        return name
>> +
>> +    def _make_array_type(self, element_type):
>> +        name = element_type + 'List'
>> +        if not self.lookup_type(name):
>> +            self._def_entity(QAPISchemaArrayType(name, None, element_type))
>> +        return name
>> +
>> +    def _make_implicit_object_type(self, name, role, members):
>> +        name = ':obj-%s-%s' % (name, role)
>> +        self._def_entity(QAPISchemaObjectType(name, None, None, members, 
>> None))
>> +        return name
>
> We may extend the qapi .json syntax to allow anonymous types in more
> places, but so far, this matches that all existing use of anonymous
> types are for structs, not unions.

Yes.

If we unify struct and union in the concrete syntax like we do in this
internal representation, then anonymous unions (really: anonymous
objects with variants) may well fall out naturally.

>> +
>> +    def _def_enum_type(self, expr, info):
>> +        name = expr['enum']
>> +        data = expr['data']
>> +        self._def_entity(QAPISchemaEnumType(name, info, data))
>> +        self._make_array_type(name) # TODO really needed?
>> +
>> +    def _make_member(self, name, typ):
>> +        optional = False
>> +        if name.startswith('*'):
>> +            name = name[1:]
>> +            optional = True
>> +        if isinstance(typ, list):
>> +            assert len(typ) == 1
>> +            typ = self._make_array_type(typ[0])
>> +        return QAPISchemaObjectTypeMember(name, typ, optional)
>> +
>> +    def _make_members(self, data):
>> +        return [self._make_member(key, data[key]) for key in data.keys()]
>> +
>> +    def _def_struct_type(self, expr, info):
>> +        name = expr['struct']
>> +        base = expr.get('base')
>> +        data = expr['data']
>> +        self._def_entity(QAPISchemaObjectType(name, info, base,
>> +                                              self._make_members(data),
>> +                                              None))
>> +        self._make_array_type(name) # TODO really needed?
>> +
>> +    def _make_flat_variant(self, case, typ):
>> +        return QAPISchemaObjectTypeVariant(case, typ, True)
>> +
>> +    def _make_flat_variants(self, tag_name, data):
>> +        variants = [self._make_flat_variant(key, data[key])
>> +                    for key in data.keys()]
>> +        return QAPISchemaObjectTypeVariants(tag_name, None, variants)
>> +
>> +    def _make_simple_variant(self, case, typ):
>> +        if isinstance(typ, list):
>> +            assert len(typ) == 1
>> +            typ = self._make_array_type(typ[0])
>> +        return QAPISchemaObjectTypeVariant(case, typ, False)
>> +
>> +    def _make_simple_variants(self, type_name, data):
>> +        variants = [self._make_simple_variant(key, data[key])
>> +                    for key in data.keys()]
>> +        enum = self._make_implicit_enum_type(type_name,
>> +                                             [v.name for v in variants])
>> +        return QAPISchemaObjectTypeVariants(None, enum, variants)
>
> Again, I wonder if normalizing simple unions into flat unions (to get
> rid of the need for tracking .flat) would make later use any easier, or
> if it would just make introspection QMP output more verbose.

They *are* normalized in introspection output!

>> +
>> +    def _def_union_type(self, expr, info):
>> +        name = expr['union']
>> +        data = expr['data']
>> +        tag_name = expr.get('discriminator')
>> +        if tag_name:
>> +            base = expr['base']
>> +            variants = self._make_flat_variants(tag_name, data)
>> +        else:
>> +            base = None
>> +            variants = self._make_simple_variants(name, data)
>> +        self._def_entity(QAPISchemaObjectType(name, info, base,
>> + self._make_members(OrderedDict()),
>> +                                              variants))
>> +        self._make_array_type(name) # TODO really needed?
>> +
>> +    def _def_alternate_type(self, expr, info):
>> +        name = expr['alternate']
>> +        data = expr['data']
>> +        self._def_entity(QAPISchemaAlternateType(name, info,
>> +                                    self._make_simple_variants(name, data)))
>> +        self._make_array_type(name) # TODO really needed?
>> +
>> +    def _def_command(self, expr, info):
>> +        name = expr['command']
>> +        args = expr.get('data')
>> +        rets = expr.get('returns')
>> +        gen = expr.get('gen', True)
>> +        success_response = expr.get('success-response', True)
>> +        if args and not isinstance(args, str):
>> +            args = self._make_implicit_object_type(name, 'args',
>> +                                                   self._make_members(args))
>
> If I write { 'command':'Foo', 'data':{} }, does that try and create an
> implicit type, with no members?

tests/qapi-schema/qapi-schema-test.json tests this:

    { 'command': 'user_def_cmd', 'data': {} }

args = expr.get('data') gets an empty OrderedDict().

Since an empty dictionary counts as false, we skip calling
_make_implicit_object_type(), and pass the empty dictionary to
QAPISchemaCommand().  Works, because the class consistently treats any
false args the same as None.  tests/qapi-schema/qapi-schema-test.out:

    command user_def_cmd None -> None
       gen=True success_response=True

I'll consider cleaning this up a bit.

>                                  I'm wondering if it is any simpler to
> write args = expr.get('data', {}), and have _make_implicit_object_type()
> do the right thing when presented with an empty list of members.
>
>> +        if rets and isinstance(rets, list):
>> +            assert len(rets) == 1
>> +            rets = self._make_array_type(rets[0])
>> +        elif rets and not isinstance(rets, str):
>> +            rets = self._make_implicit_object_type(name, 'rets',
>> +                                                   self._make_members(rets))
>> +        self._def_entity(QAPISchemaCommand(name, info, args, rets, gen,
>> +                                           success_response))
>> +
>> +    def _def_event(self, expr, info):
>> +        name = expr['event']
>> +        data = expr.get('data')
>> +        if data and not isinstance(data, str):
>> +            data = self._make_implicit_object_type(name, 'data',
>> +                                                   self._make_members(data))
>> +        self._def_entity(QAPISchemaEvent(name, info, data))
>> +
>> +    def _def_exprs(self):
>> +        for expr_elem in self.exprs:
>> +            expr = expr_elem['expr']
>> +            info = expr_elem['info']
>> +            if 'enum' in expr:
>> +                self._def_enum_type(expr, info)
>> +            elif 'struct' in expr:
>> +                self._def_struct_type(expr, info)
>> +            elif 'union' in expr:
>> +                self._def_union_type(expr, info)
>> +            elif 'alternate' in expr:
>> +                self._def_alternate_type(expr, info)
>> +            elif 'command' in expr:
>> +                self._def_command(expr, info)
>> +            elif 'event' in expr:
>> +                self._def_event(expr, info)
>> +            else:
>> +                assert False
>> +
>> +    def check(self):
>> +        for ent in self.entity_dict.values():
>> +            ent.check(self)
>
> Overall seems like a very nice layout.
>
> I'll reserve R-b until I see how the rest of the series uses this object
> hierarchy, but it looks like you won't have to change very much of this
> patch when you do your next spin of the series.

Thanks!



reply via email to

[Prev in Thread] Current Thread [Next in Thread]