qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v8 15/18] qapi: Move duplicate member checks to


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v8 15/18] qapi: Move duplicate member checks to schema check()
Date: Tue, 13 Oct 2015 19:13:48 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Eric Blake <address@hidden> writes:

> On 10/13/2015 09:06 AM, Markus Armbruster wrote:
>> First reply: commit message review only.  Patch review to follow.
>> 
>> Eric Blake <address@hidden> writes:
>> 
>>> With the previous commit, we have two different locations for
>>> detecting member name clashes - one at parse time, and another
>>> at QAPISchema*.check() time.  Consolidate some of the checks
>>> into a single place, which is also in line with our TODO to
>>> eventually move all of the parse time semantic checking into
>>> the newer schema code.  The check_member_clash() function is
>>> no longer needed.
>>>
>>> Checking variants is tricky:
>> 
>> Indeed.
>> 
>> Struct types aren't as tricky, but tricky enough to warrant spelling
>> them out, so let me do that.
>
> Thanks. May be worth cribbing this information into the commit message,
> if there is a reason for a v9 respin.
>
>> 
>> QMP: The member names of the JSON object are exactly the member names of
>> the struct type.  Thus, members can clash with each other (d'oh!).
>> 
>> C: The member names of the C struct are exactly the C names of the *own*
>> (non-inherited) member names of the struct type, plus 'base' if it has a
>> base type, plus a has_FOO flag for each optional local member with C
>> name FOO.  Therefore, local members can clash with each other (d'oh
>> again!), and additionally with 'base' and the has_FOOs.
>> 
>> The additional clashes are self-inflicted pain:
>> 
>> * Less foolish names for the flags, i.e. ones that cannot occur as
>>   member names, would eliminate the has_FOO clashes.
>
> Or even eliminating flags: at least for 'str' and objects, we can use
> foo==NULL in place of has_foo==false.  I have plans to add a markup to
> various structs for when it is safe to eliminate the has_PTR bools, and
> then over a series of patches clean up the .json files to take advantage
> of it - but not until after the pending patches are flushed.

I've wanted to get rid of the "have flags" for pointers since basically
forever.

But even then we'll still have other "have flags", and we'll want less
foolish names for them.

>> 
>> * Unboxing base types like we do for unions would eliminate the 'base'
>>   clash.  Heck, even renaming base to something that cannot occur as
>>   member name would.
>
> My unflushed queue already includes this change, and I'm likely to
> promote it to the front of the queue after subset B goes in (along with
> the kind/type fixes).

Sounds good.

>> If we check for clashes with *inherited* member names, too, then
>> checking for C clashes suffices, because when the QMP member names
>> clash, the C member names clash, too.
>
> Another clash is when two QMP names map to the same C name, as in 'a-b'
> vs. 'a_b'.  Checking for C member name clashes rejects things that are
> not a QMP collision now, but also leaves the door open in case we later
> decide to make QMP key detection looser and allow '-' vs. '_' to be
> synonyms (if we allowed both spellings in the same struct now, then we
> prevent making them synonyms in the future for anyone that (ab)uses both
> names in the same struct).

Keeping the door open for cleaning up the '-' vs. '_' mess is a good
point.

>>>                              we need to check that the branch
>>> name will not cause a collision (important for C code, but
>>> no bearing on QMP).  Then, if the variant belongs to a union
>>> (flat or simple), we need to check that QMP members of that
>>> type will not collide with non-variant QMP members (but do
>>> not care if they collide with C branch names).
>> 
>> Union types.
>> 
>> QMP: The member names of the JSON object are exactly the names of the
>> union type's non-variant members (this includes the tag member; a simple
>> union's implicit tag is named 'type') plus the names of a single case's
>> variant members.  Branch names occurs only as (tag) value, not as key.
>> 
>> Thus, members can clash with each other, except for variant members from
>> different cases.
>> 
>> C: The member names of the C struct are
>> 
>> * the C names of the non-variant members (this includes the tag member;
>>   a simple union's implicit tag is named 'kind' now, soon 'type')
>> 
>> * a has_FOO for each optional non-variant member with C name FOO
>> 
>> * the branch names, wrapped in an unnamed union
>> 
>> * 'data', wrapped in the same unnamed union
>> 
>> Therefore, non-variant members can clash with each other as for struct
>> types (except here the inherited members *are* unboxed already), and
>> additionally
>> 
>> * branch names can clash with each other (but that's caught when
>>   checking the tag enumeration, which has the branch names as values)
>> 
>> * branch names can clash with non-variant members
>> 
>> * 'data' can clash with branch names and non-variant members
>> 
>> The additional clashes are all self-inflicted pain: either give the
>> union a name that cannot clash with a non-variant member, or unbox the
>> cases and rename or kill 'data'.
>
> Later patches kill 'data'.  I'm not convinced about unboxing cases:
> consider this qapi (with abbreviated notation):
>
> { 'union': 'Foo', 'base': { 'type': 'MyEnum' },
>   'discriminator': 'type',
>   'data': { 'b1': { 'm': 'int' },
>             'b2': { 'm': 'bool' } } }
>
> which, if partially unboxed, would result in:
>
> struct Foo {
>     MyEnum type;
>     union { /* tag is type */
>         struct {
>             int m;
>         } b1;
>         struct {
>             bool m;
>         } b2;
>     };
> };
>
> where the case names are still present,

This makes sense, I think.

>                                         or if completely unboxed via
> anonymous structs within the anonymous union, would present:
>
> struct Foo {
>     MyEnum type;
>     union { /* tag is type */
>         struct {
>             int m;
>         };
>         struct {
>             bool m;
>         };
>     };
> };
>
> Oops - by making m completely anonymous, we have a C collision.  But if
> we munge things, such as 'b1_m' or 'b2_m', we no longer can use C names
> to detect QMP collisions with the non-variant names.

You're right.

Calling the member b1_m when we could just as well call it b1.m feels
silly.

> I still haven't played with naming the union, but the more we discuss
> this, the more I like the idea of:
>
> struct Foo {
>     MyEnum type;
>     union { /* tag is type */
>         B1 b1;
>         B2 b2;
>     } u;
> };
>
> where the caller has to do foo.u.b1.m or foo.u.b2.m (instead of the
> current foo.b1->m or foo.b2->m).  That is, hiding the branch names
> behind the union named 'u' (of course, we'd have to forbid a QMP
> non-variant named 'u', or else go with a union named '_u') instantly
> solves the problem of branches colliding with QMP names, at the price of
> slightly more verbose C client code, and while still requiring effort to
> track QMP collisions between branch members and non-variant members.

I like the idea to avoid silly clashes by naming the union.  The actual
union name is debatable detail.

>> If we check for clashes between the non-variant members and each single
>> case's variant members, too, then checking for C clashes suffices,
>> because when the QMP member names clash, the C member names clash, too.
>> 
>>>                                                 Each call to
>>> variant.check() has a 'seen' that contains all [*] non-variant
>>> C names (which includes all non-variant QMP names), but does
>> 
>> What exactly do you mean by the parenthesis?
>
> I was trying to get at what you covered: that the set of C names
> includes things like 'has_*' that are not QMP names, but that all
> non-variant QMP names are at least included in the set of C names in the
> 'seen' dictionary.

Ah, okay.

>>> not add to that array (QMP members of one branch do not cause
>>> collisions with other branches).  This means that there is
>>> one form of collision we still miss: when two branch names
>>> collide.  However, that will be dealt with in the next patch.
>> 
>> Well, it's already dealt with.  The next patch merely moves the dealing
>> into the .check().
>> 
>>> [*] Exception - the 'seen' array doesn't contain 'base', which
>>> is currently a non-variant C member of structs; but since
>>> structs don't contain variants, it doesn't hurt. Besides, a
>>> later patch will be unboxing structs the way flat unions
>>> have already been unboxed.
>>>
>>> The wording of several error messages has changed, but in many
>>> cases feels like an improvement rather than a regression.  The
>>> recent change (commit 7b2a5c2) to avoid an assertion failure
>>> when a flat union branch name collides with its discriminator
>>> name is also handled nicely by this code; but there is more work
>>> needed before we can detect all collisions in simple union branch
>>> names done by the old code.
>> 
>> Only simple?
>
> Ugh, I guess my wording needs an update here, to match the fact that I
> updated the code to cover all unions.  I was trying to convey that this
> patch cannot revert everything added in 7b2a5c2, because the branch name
> collision detection added in that patch is not covered by this move to
> check().
>
>> 
>> I've come to the conclusion that we should get rid of the self-inflicted
>> pain before we attempt to detect all collisions.
>
> Then that sounds like I should try harder to get the kind/type naming,
> the boxed base naming, and even the anonymous union naming all hoisted
> into this subset, and spin a v9?

I can take PATCH 01-09,12 into my tree right away, with PATCH 07's two
redundant is_implicit() methods dropped, and PATCH 12's comment touched
up.

I could take PATCH 10, but let's at least try to make a plan for
c_name() first.  If we fail, I'll take the patch, perhaps less the % to
+ change, and we'll revisit c_name() later when we see more clearly.

You want to move PATCH 11 to later in the queue, and I like that.

PATCH 13 needs a fix squashed in, and a few nits touched up.  If you
want me to do that on commit, please propose a patch for me to squash
in.  But a respin is probably easier for all.

PATCH 14 is fine, but it depends on 13.

I haven't finished review of PATCH 15-18.

Taken together, I think the easiest way forward is I take 01-09,12, and
you respin the rest after we finish its review.  Makes sense?



reply via email to

[Prev in Thread] Current Thread [Next in Thread]