qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v5 13/17] qapi: add qapi2texi script


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v5 13/17] qapi: add qapi2texi script
Date: Tue, 06 Dec 2016 12:50:04 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

I had to resort to diff to find your replies, and massage the text
manually to produce a readable reply myself.  Please quote the usual
way.

> Markus Armbruster <address@hidden> writes:
> 
> > Marc-André Lureau <address@hidden> writes:
> >
> >> As the name suggests, the qapi2texi script converts JSON QAPI
> >> description into a texi file suitable for different target
> >> formats (info/man/txt/pdf/html...).
> >>
> >> It parses the following kind of blocks:
> >>
> >> Free-form:
> >>
> >>   ##
> >>   # = Section
> >>   # == Subsection
> >>   #
> >>   # Some text foo with *emphasis*
> >>   # 1. with a list
> >>   # 2. like that
> >>   #
> >>   # And some code:
> >>   # | $ echo foo
> >>   # | -> do this
> >>   # | <- get that
> >>   #
> >>   ##
> >>
> >> Symbol:
> >>
> >>   ##
> >>   # @symbol:
> >>   #
> >>   # Symbol body ditto ergo sum. Foo bar
> >>   # baz ding.
> >>   #
> >>   # @arg: foo
> >>   # @arg: #optional foo
> >
> > Let's not use @arg twice.
> >
> > Terminology: I prefer to use "parameter" for formal parameters, and
> > "argument" for actual arguments.  This matches how The Python Language
> > Reference uses the terms.
> >
> > What about
> >
> >     # @param1: the frob to frobnicate
> >     # @param2: #optional how hard to frobnicate
> 
> ok
> 
> >>   #
> >>   # Returns: returns bla bla
> >>   #          Or bla blah
> >
> > Repeating "returns" is awkward, and we don't do that in our schemas.
> >
> > We need a period before "Or", or spell it "or".
> >
> > What about
> >
> >     # Returns: the frobnicated frob.
> >     #          If frob isn't frobnicatable, GenericError.
> 
> ok
> 
> >>   #
> >>   # Since: version
> >>   # Notes: notes, comments can have
> >>   #        - itemized list
> >>   #        - like this
> >>   #
> >>   # Example:
> >>   #
> >>   # -> { "execute": "quit" }
> >>   # <- { "return": {} }
> >>   #
> >>   ##
> >>
> >> That's roughly following the following EBNF grammar:
> >>
> >> api_comment = "##\n" comment "##\n"
> >> comment = freeform_comment | symbol_comment
> >> freeform_comment = { "# " text "\n" | "#\n" }
> >> symbol_comment = "# @" name ":\n" { member | meta | freeform_comment }
> >
> > Rejects non-empty comments where "#" is not followed by space.  Such
> > usage is accepted outside doc comments.  Hmm.
> >
> >> member = "# @" name ':' [ text ] freeform_comment
> >
> > Are you missing a "\n" before freeform_comment?
> 
> yes
> 
> >> meta = "# " ( "Returns:", "Since:", "Note:", "Notes:", "Example:", 
> >> "Examples:" ) [ text ] freeform_comment
> >
> > Likewise.
> 
> ok
> 
> >> text = free-text markdown-like, "#optional" for members
> >
> > The grammar is ambiguous: a line "# @foo:\n" can be parsed both as
> > freeform_comment and as symbol_comment.  Since your intent is obvious
> > enough, it can still serve as documentation.  It's not a suitable
> > foundation for parsing, though.  Okay for a commit message.
> >
> >> Thanks to the following json expressions, the documentation is enhanced
> >> with extra information about the type of arguments and return value
> >> expected.
> >
> > I guess you want to say that we enrich the documentation we extract from
> > comments with information from the actual schema.  Correct?
> 
> yes
> 
> > Missing: a brief discussion of deficiencies.  These include:
> >
> > * The generated QMP documentation includes internal types
> >
> >   We use qapi-schema.json both for defining the external QMP interface
> >   and for defining internal types.  qmp-introspect.py carefully
> >   separates the two, to not expose internal types.  qapi2texi.py happily
> >   exposes everything.
> >
> >   Currently, about a fifth of the types in the generated docs are
> >   internal:
> >
> >       AcpiTableOptions
> >       BiosAtaTranslation
> >       BlockDeviceMapEntry
> >       COLOMessage
> >       COLOMode
> >       DummyForceArrays
> >       FailoverStatus
> >       FloppyDriveType
> >       ImageCheck
> >       LostTickPolicy
> >       MapEntry
> >       MigrationParameter
> >       NetClientDriver
> >       NetFilterDirection
> >       NetLegacy
> >       NetLegacyNicOptions
> >       NetLegacyOptions
> >       NetLegacyOptionsKind
> >       Netdev
> >       NetdevBridgeOptions
> >       NetdevDumpOptions
> >       NetdevHubPortOptions
> >       NetdevL2TPv3Options
> >       NetdevNetmapOptions
> >       NetdevNoneOptions
> >       NetdevSocketOptions
> >       NetdevTapOptions
> >       NetdevUserOptions
> >       NetdevVdeOptions
> >       NetdevVhostUserOptions
> >       NumaNodeOptions
> >       NumaOptions
> >       NumaOptionsKind
> >       OnOffAuto
> >       OnOffSplit
> >       PreallocMode
> >       QCryptoBlockCreateOptions
> >       QCryptoBlockCreateOptionsLUKS
> >       QCryptoBlockFormat
> >       QCryptoBlockInfo
> >       QCryptoBlockInfoBase
> >       QCryptoBlockInfoQCow
> >       QCryptoBlockOpenOptions
> >       QCryptoBlockOptionsBase
> >       QCryptoBlockOptionsLUKS
> >       QCryptoBlockOptionsQCow
> >       QCryptoSecretFormat
> >       QCryptoTLSCredsEndpoint
> >       QapiErrorClass
> >       ReplayMode
> >       X86CPUFeatureWordInfo
> >       X86CPURegister32
> >
> >   Generating documentation for internal types might be useful, but
> >   letting them pollute QMP interface documentation isn't.  Needs fixing
> >   before we release.  Until then, needs a FIXME comment in qapi2texi.py.
> >
> > * Union support is lacking
> >
> >   The doc string language is adequate for documenting commands, events,
> >   and non-union types.  It doesn't really handle union variants.  Hardly
> >   surprising, as you fitted the language do existing practice, and
> >   existing (mal-)practice is neglecting to document union variant
> >   members.
> >
> > * Documentation is lacking
> >
> >   See review of qapi-code-gen.txt below.
> >
> > * Doc comment error message positions are imprecise
> >
> >   They always point to the beginning of the comment.
> >
> > * Probably more
> >
> >   We should update this with noteworthy findings during review.  I
> >   tried, but I suspect I missed some.
> 
> ok
> 
> >> Signed-off-by: Marc-André Lureau <address@hidden>
> > [Lengthy diffstat snipped...]
> >>
> >> diff --git a/tests/Makefile.include b/tests/Makefile.include
> >> index e98d3b6..f16764c 100644
> >> --- a/tests/Makefile.include
> >> +++ b/tests/Makefile.include
> >> @@ -350,6 +350,21 @@ qapi-schema += base-cycle-direct.json
> >>  qapi-schema += base-cycle-indirect.json
> >>  qapi-schema += command-int.json
> >>  qapi-schema += comments.json
> >> +qapi-schema += doc-bad-args.json
> >> +qapi-schema += doc-bad-symbol.json
> >> +qapi-schema += doc-duplicated-arg.json
> >> +qapi-schema += doc-duplicated-return.json
> >> +qapi-schema += doc-duplicated-since.json
> >> +qapi-schema += doc-empty-arg.json
> >> +qapi-schema += doc-empty-section.json
> >> +qapi-schema += doc-empty-symbol.json
> >> +qapi-schema += doc-invalid-end.json
> >> +qapi-schema += doc-invalid-end2.json
> >> +qapi-schema += doc-invalid-return.json
> >> +qapi-schema += doc-invalid-section.json
> >> +qapi-schema += doc-invalid-start.json
> >> +qapi-schema += doc-missing-expr.json
> >> +qapi-schema += doc-missing-space.json
> >>  qapi-schema += double-data.json
> >>  qapi-schema += double-type.json
> >>  qapi-schema += duplicate-key.json
> >> @@ -443,6 +458,8 @@ qapi-schema += union-optional-branch.json
> >>  qapi-schema += union-unknown.json
> >>  qapi-schema += unknown-escape.json
> >>  qapi-schema += unknown-expr-key.json
> >> +
> >> +
> >>  check-qapi-schema-y := $(addprefix tests/qapi-schema/, $(qapi-schema))
> >>  
> >>  GENERATED_HEADERS += tests/test-qapi-types.h tests/test-qapi-visit.h \
> >> diff --git a/scripts/qapi.py b/scripts/qapi.py
> >> index 4d1b0e4..1b456b4 100644
> >> --- a/scripts/qapi.py
> >> +++ b/scripts/qapi.py
> >> @@ -122,6 +122,109 @@ class QAPILineError(Exception):
> >>              "%s:%d: %s" % (self.info['file'], self.info['line'], self.msg)
> >>  
> >>  
> >> +class QAPIDoc(object):
> >> +    class Section(object):
> >> +        def __init__(self, name=""):
> >
> > name=None feels more natural than "" for an absent optional name.
> 
> ok
> 
> >> +            # optional section name (argument/member or section name)
> >> +            self.name = name
> >> +            # the list of strings for this section
> >
> > Would "list of lines" be more accurate, or less?
> 
> more accurate, ok
> 
> >> +            self.content = []
> >> +
> >> +        def append(self, line):
> >> +            self.content.append(line)
> >> +
> >> +        def __repr__(self):
> >> +            return "\n".join(self.content).strip()
> >> +
> >> +    class ArgSection(Section):
> >> +        pass
> >
> > Begs the question what makes ArgSection differ from Section.  I think
> > it's the semantics of self.name: an ArgSection's name is the parameter
> > name, a non-ArgSection can either be a "meta" section (name is one of
> > "Returns"", "Since", "Note", "Notes", "Example", "Examples") or an
> > anonymous section (name is "").
> 
> yes, and the section type is checked in  _append_freeform()
> 
> >> +
> >> +    def __init__(self, parser):
> >> +        self.parser = parser
> >> +        self.symbol = None
> >> +        self.body = QAPIDoc.Section()
> >> +        # a dict {'arg': ArgSection, ...}
> >
> > Works.  Alternatevely:
> >
> >            # dict mapping parameter name to ArgSection
> 
> ok
> 
> >> +        self.args = OrderedDict()
> >> +        # a list of Section
> >> +        self.meta = []
> >> +        # the current section
> >> +        self.section = self.body
> >> +        # associated expression and info (set by expression parser)
> >
> > "will be set by expression parser", or "to be set by expression parser"?
> 
> ok
> 
> >> +        self.expr = None
> >> +        self.info = None
> >
> > Clearer now.  The one remark I still have is on the use of "meta" isn't
> > obvious here.  I think a comment further up explaining the different
> > kinds of sections would help.
> >
> > To be honest, I don't get what's "meta" about the "meta" sections :)
> 
> Let's drop that distinction, it isn't necessary anymore
> 
> >> +
> >> +    def has_meta(self, name):
> >> +        """Returns True if the doc has a meta section 'name'"""
> >
> > PEP257[1] demands imperative mood, and punctuation:
> >
> >            """Return True if the doc has a meta section 'name'."""
> >
> > Unfortunately, there seems to be no established convention for marking
> > parameter names.  You put this one in single quotes, which works here,
> > but could be confusing in other cases.  I'd sidestep it:
> >
> >            """Return True if we have a meta section with this name."""
> >
> > or
> >
> >            """Does a meta section with this name exist?"""
> 
> ok
> 
> >> +        for i in self.meta:
> >> +            if i.name == name:
> >> +                return True
> >> +        return False
> >> +
> >> +    def append(self, line):
> >> +        """Adds a # comment line, to be parsed and added to current 
> >> section"""
> >
> > Imperative mood:
> >
> >     """Add a comment line, to be parsed and added to the current section."""
> >
> > However, we're not always adding to the current section, we can also
> > start a new one.  The following avoids suggesting anything about the
> > current section:
> >
> >     """Parse a comment line and add it to the documentation."""
> >
> > When the function name is a verb, and its doc string starts with a
> > different verb, the name might be suboptimal.  Use your judgement.
> >
> > If this one-liner is too terse, we can try a multi-line doc string:
> >     
> >     """Parse a comment line and add it to the documentation.
> >
> >     TODO should we tell more about how we parse?
> >
> >     Args:
> >         line: the comment starting with '#', with the newline stripped.
> >
> >     Raises:
> >         QAPISchemaError: TODO explain error conditions
> >     """
> >
> > Format stolen from the Google Python Style Guide[2], because PEP257 is
> > of not much help there.
> >
> > Once you start with such elaborate doc strings, pressure will likely
> > mount to do the same elsewhere.  Could be a distraction right now.  Use
> > your judgement.
> 
> ok, thanks
> 
> >> +        line = line[1:]
> >> +        if not line:
> >> +            self._append_freeform(line)
> >> +            return
> >> +
> >> +        if line[0] != ' ':
> >> +            raise QAPISchemaError(self.parser, "missing space after #")
> >> +        line = line[1:]
> >
> > QAPISchemaError takes the error position from its QAPISchemaParser
> > argument.  In other words, the error position depends on the state of
> > the parser when it calls this function.  Turns out the position is the
> > beginning of the comment.  Servicable here.  Action at a distance,
> > though.
> >
> > Less strict:
> >
> >            # strip leading # and space
> >            line = line[1:]
> >            if line[0] == ' ':
> >                line = line[1:]
> >
> > Also avoids special-casing empty comments.
> 
> That would raise "IndexError: string index out of range" on empty comment
> lines ("#")

Fixable:
               if line.startswith(' '):

> Also I'd like to keep the error in case a space is missing (it caught some
> already).

On the one hand, I share your distaste for lack of space between # and
comment text.  On the other hand, I dislike enforcing spacing
inconsistently: enforce in doc comments, but not in other comments.

> > Perhaps we should treat all leading whitespace the same (I'm not sure):
> >
> >            # strip leading # and whitespace
> >            line = line[1:]
> >            line = line.lstrip()
> 
> That would break indentation in Example sections

True; scratch the idea.

> >> +
> >> +        if self.symbol:
> >> +            self._append_symbol_line(line)
> >> +        elif (not self.body.content and
> >> +              line.startswith("@") and line.endswith(":")):
> >> +            self.symbol = line[1:-1]
> >> +            if not self.symbol:
> >> +                raise QAPISchemaError(self.parser, "Invalid symbol")
> >
> > Doesn't recognize the symbol when there's anything other than a single
> > space between # and @.  Pathological case: 'address@hidden:' starting in 
> > column
> > 6 looks exactly like '# @foo:', but doesn't work.  Fortunately, your
> > code rejects the tab there.  Assigning meaning to (leading) whitespace
> > may make sense, but it must be documented clearly.
> 
> I think it's simpler to only accept a simple space (# + space + @ + symbol
> + :), it will also lead to more uniform doc.

Simpler is always an argument worth considering, but simpler isn't
always better.  Experience with other languages has taught me to be
carefule when making the kind or amount of whitespace significant.

Kind of whitespace is not a usability issue with your proposal, because
your patch rejects anything but space.

Amount of whitespace is, because getting it wrong turns a symbol comment
block into a free-form comment block, or a parameter section into
free-form lines belonging to whatever section precedes it.

As long as we reject definitions without a symbol comment, errors of the
former kind will still be caught, although the error message could be
confusing.

As is, we don't reject definitions whose symbol comment fails to
document all parameters.  Errors of the latter kind will therefore go
undetected.  Trap for the unwary.  See also "@param2 is True" below.

If we can't find a solution without such usability issues now, then I'm
willing to settle for documentation warning unwary users, or even for a
FIXME comment.

> > Doesn't recognize the symbol when there's crap after ":".  Confusing
> > when it's invisible crap, i.e. trailing whitespace.  In general, it's
> > best to parse left to right, like this:
> >
> >            elif not self.body.content and line.startswith('@'):
> >                match = valid_name.match(line[1:]);
> >                if not match:
> >                    raise QAPISchemaError(self.parser, 'Invalid name')
> >                if line.startswith(':', match.end(0)):
> >                    raise QAPISchemaError(self.parser, 'Expected ":")
> 
> Easier is simply to check :
> if not line.endswith(":"):
>                 raise QAPIParseError(self.parser, "Line should end with :")
> 
> test added

There's a reason we have a mountain of theory on parsing, built in
decades of practice.  What starts "easier" than that generally ends up
rewritten from scratch.  Anyway, let's continue.

> > Here, the error position is clearly suboptimal: it's the beginning of
> > the comment instead of where the actual error is, namely the beginning /
> > the end of the name.  More of the same below.  The root cause is
> > stacking parsers.  I'm not asking you to do anything about it at this
> > time.
> >
> > Note my use of "name" rather than "symbol" in the error message.
> > qapi-code-gen.txt talks about names, not symbols.  Consistent use of
> > terminology matters.
> 
> Agree, isn't "symbol" more appropriate for documentation though?

Possibly.  But if it's an improvement for generated documentation, it
surely is an improvement for hand-written documentation such as
qapi-code-gen.txt and the comments in qapi*.py, too.  And the Python
identifiers.  I have to admit that cools my enthusiasm for making the
improvement right now :)

> >> +        else:
> >> +            self._append_freeform(line)
> >> +
> >> +    def _append_symbol_line(self, line):
> >> +        name = line.split(' ', 1)[0]
> >> +
> >> +        if name.startswith("@") and name.endswith(":"):
> >> +            line = line[len(name)+1:]
> >> +            self._start_args_section(name[1:-1])
> >
> > Similar issue.  Left to right:
> >
> >            if re.match(r'\s*@', line):
> >                match = valid_name.match(line[1:]);
> >                if not match:
> >                    raise QAPISchemaError(self.parser, 'Invalid parameter 
> > name')
> >                line = line[match.end(0):]
> >                if line.startswith(':'):
> >                    raise QAPISchemaError(self.parser, 'Expected ":")
> >                line = line[1:]
> >                self._start_args_section(match.group(0))
> >
> > Even better would be factoring out the common code to parse '@foo:'.
> 
> It's a valid case to have a section freeform comment starting with @foo, ex:
> 
> @param: do this and than if
>                @param2 is True

Good point, but it's a language issue, not a parsing issue.

The language issue is that we need to recognize a line "# @param: ..."
as the start of a parameter section, while still permitting free-form
lines like "# @param2 is True".

This is a pretty convincing argument for requiring the ':'.  It can also
be used as an argument for requiring exactly one space between # and @.

How the language is parsed is completely orthogonal.

>  I don't think it's worth to factor out the code to parse @foo vs @foo:, yet
> 
> >> +        elif name in ("Returns:", "Since:",
> >> +                      # those are often singular or plural
> >> +                      "Note:", "Notes:",
> >> +                      "Example:", "Examples:"):
> >> +            line = line[len(name)+1:]
> >> +            self._start_meta_section(name[:-1])
> >
> > Since we're re.match()ing already, here's how do to it that way:
> 
> I don't follow your suggestion for the reason above,so I still have the
> first word 'name'
> 
> >            else
> >                match = re.match(r'\s*(Returns|Since|Notes?|Examples?):', 
> > line)
> >                if match:
> >                    line = line[match.end(0):]
> >                    self._start_meta_section(match.group(1))
> 
> not really much nicer to me (especially because no match yet), I'll keep
> the current code for now
> 
> >> +
> >> +        self._append_freeform(line)
> >> +
> >> +    def _start_args_section(self, name):
> >> +        if not name:
> >> +            raise QAPISchemaError(self.parser, "Invalid argument name")
> >
> > parameter name
> 
> ok
> 
> >> +        if name in self.args:
> >> +            raise QAPISchemaError(self.parser, "'%s' arg duplicated" % 
> >> name)
> >
> > Duplicate parameter name
> 
> ok
> 
> >> +        self.section = QAPIDoc.ArgSection(name)
> >> +        self.args[name] = self.section
> >> +
> >> +    def _start_meta_section(self, name):
> >> +        if name in ("Returns", "Since") and self.has_meta(name):
> >> +            raise QAPISchemaError(self.parser,
> >> +                                  "Duplicated '%s' section" % name)
> >> +        self.section = QAPIDoc.Section(name)
> >> +        self.meta.append(self.section)
> >> +
> >> +    def _append_freeform(self, line):
> >> +        in_arg = isinstance(self.section, QAPIDoc.ArgSection)
> >> +        if in_arg and self.section.content and not 
> >> self.section.content[-1] \
> >> +           and line and not line[0].isspace():
> >
> > PEP8[3] prefers parenthesises over backslash:
> >
> >            if (in_arg and self.section.content and not 
> > self.section.content[-1]
> >                and line and not line[0].isspace()):
> 
> yes, thanks
> 
> >> +            # an empty line followed by a non-indented
> >> +            # comment ends the argument section
> >> +            self.section = self.body
> >> +            self._append_freeform(line)
> >> +            return
> >
> > Switching back to the self.body section like this reorders the
> > documentation text.  I still think this is a terrible idea.  A dumb
> > script is exceedingly unlikely to improve human-written doc comment text
> > by reordering.  In the rare case it does, the doc comment source should
> > be reordered.
> >
> > Here's an example where the doc generator happily creates unintelligible
> > garbage if I format CpuDefinitionInfo's doc comment in a slightly off
> > way:
> >
> >     ##
> >     # @CpuDefinitionInfo:
> >     #
> >     # Virtual CPU definition.
> >     #
> >     # @name: the name of the CPU definition
> >     # 
> >     # @migration-safe: #optional whether a CPU definition can be safely
> >     # used for migration in combination with a QEMU compatibility
> >     # machine when migrating between different QMU versions and between
> >     # hosts with different sets of (hardware or software)
> >     # capabilities.
> >     # 
> >     # If not provided, information is not available and callers should
> >     # not assume the CPU definition to be migration-safe. (since 2.8)
> >     #
> >     # @static: whether a CPU definition is static and will not change
> >     # depending on QEMU version, machine type, machine options and
> >     # accelerator options.  A static model is always
> >     # migration-safe. (since 2.8)
> >     #
> >     # @unavailable-features: #optional List of properties that prevent
> >     # the CPU model from running in the current host. (since 2.8)
> >     #
> >     # @unavailable-features is a list of QOM property names that
> >     # represent CPU model attributes that prevent the CPU from running.
> >     # If the QOM property is read-only, that means there's no known
> >     # way to make the CPU model run in the current host. Implementations
> >     # that choose not to provide specific information return the
> >     # property name "type".
> >     #
> >     # If the property is read-write, it means that it MAY be possible
> >     # to run the CPU model in the current host if that property is
> >     # changed. Management software can use it as hints to suggest or
> >     # choose an alternative for the user, or just to generate meaningful
> >     # error messages explaining why the CPU model can't be used.
> >     # If @unavailable-features is an empty list, the CPU model is
> >     # runnable using the current host and machine-type.
> >     # If @unavailable-features is not present, runnability
> >     # information for the CPU is not available.
> >     #
> >     # Since: 1.2.0
> >     ##
> >     { 'struct': 'CpuDefinitionInfo',
> >       'data': { 'name': 'str', '*migration-safe': 'bool', 'static': 'bool',
> >                 '*unavailable-features': [ 'str' ] } }
> >
> > To detect the problem, you have to read the generated docs attentively.
> > You know my opinion on our chances for that to happen during
> > development.
> >
> > My point is not that we should support this doc comment format.  My
> > point is that people could conceivably write something like it, and not
> > get caught in patch review.
> >
> > I can see three ways out of this swamp:
> >
> > 1. Let sections continue until another one begins.
> 
> but we have interleaved  sections, and no explicit "body" section tag, ex:
> 
> ##
> # @input-send-event:
> #
> # Send input event(s) to guest.
> #
> # @device: #optional display device to send event(s) to.
> # @head: #optional head to send event(s) to, in case the
> #        display device supports multiple scanouts.
> # @events: List of InputEvent union.
> #
> # Returns: Nothing on success.
> #
> # The @device and @head parameters can be used to send the input event
> # to specific input devices in case (a) multiple input devices of the
> # same kind are added to the virtual machine and (b) you have
> # configured input routing (see docs/multiseat.txt) for those input
> # ....
> 
> > 2. Make text between sections an error.
> >
> > 3. When a section ends, start a new anonymous section.
> 
> It's not clear how to recognize when a section ends and append to comment
> body.
> 
> 2. Make text between sections an error.
> 
> Sound like the best option to me. I'll fix the doc and add a check & test
> for this common pattern though:
> 
> ##
> # @TestStruct:
> #
> # body with @var
> #
> # @integer: foo
> #           blah
> #
> #           bao
> #
> # Let's catch this bad comment.
> ##

Go ahead.

> > Can't say offhand which one will work best.
> >
> >> +        if in_arg or not self.section.name.startswith("Example"):
> >> +            line = line.strip()
> >
> > Stripping whitespace is not "Markdown-like", because intendation carries
> > meaning in Markdown.  Example:
> >
> > * First item in itemized list
> > * Second item
> >     * Sub-item of second item
> >     * Another sub-item
> > * Third item
> >
> > Stripping whitespace destroys the list structure.  If that's what you
> > want, you get to document where your "Markdown-like" markup is unlike
> > Markdown :)
> 
> let's use "comment annotations" instead

Okay.

> > Is there a technical reason for stripping whitespace?
> >
> > See also discussion of space after # above.
> >
> >> +        self.section.append(line)
> >> +
> >> +
> >>  class QAPISchemaParser(object):
> >>  
> >>      def __init__(self, fp, previously_included=[], incl_info=None):
> >> @@ -137,11 +240,18 @@ class QAPISchemaParser(object):
> >>          self.line = 1
> >>          self.line_pos = 0
> >>          self.exprs = []
> >> +        self.docs = []
> >>          self.accept()
> >>  
> >>          while self.tok is not None:
> >>              info = {'file': fname, 'line': self.line,
> >>                      'parent': self.incl_info}
> >> +            if self.tok == '#' and self.val.startswith('##'):
> >
> > How can self.val *not* start with '##' here?
> 
> you are right, unnecessary condition since we break get_expr() only in
> this case now
> 
> >> +                doc = self.get_doc()
> >> +                doc.info = info
> >
> > Let's pass info as argument to get_doc(), so we don't have to dot into
> > doc here.  get_doc() can avoid dotting into doc by passing info to the
> > constructor.
> 
> ok
> 
> >> +                self.docs.append(doc)
> >> +                continue
> >> +
> >>              expr = self.get_expr(False)
> >>              if isinstance(expr, dict) and "include" in expr:
> >>                  if len(expr) != 1:
> >> @@ -160,6 +270,7 @@ class QAPISchemaParser(object):
> >>                          raise QAPILineError(info, "Inclusion loop for %s"
> >>                                              % include)
> >>                      inf = inf['parent']
> >> +
> >>                  # skip multiple include of the same file
> >>                  if incl_abs_fname in previously_included:
> >>                      continue
> >> @@ -171,12 +282,38 @@ class QAPISchemaParser(object):
> >>                  exprs_include = QAPISchemaParser(fobj, 
> >> previously_included,
> >>                                                   info)
> >>                  self.exprs.extend(exprs_include.exprs)
> >> +                self.docs.extend(exprs_include.docs)
> >>              else:
> >>                  expr_elem = {'expr': expr,
> >>                               'info': info}
> >> +                if self.docs and not self.docs[-1].expr:
> >> +                    self.docs[-1].expr = expr
> >> +                    expr_elem['doc'] = self.docs[-1]
> >> +
> >
> > Attaches the expression to the last doc comment that doesn't already
> > have one.  A bit sloppy, because there could be non-doc comments in
> > between, or the doc comment could be in another file.  It'll do, at
> > least for now.
> 
> I extended the condition to check it attaches the doc from the same file
> 
> >>                  self.exprs.append(expr_elem)
> >>  
> >> -    def accept(self):
> >> +    def get_doc(self):
> >> +        if self.val != '##':
> >> +            raise QAPISchemaError(self, "Junk after '##' at start of "
> >> +                                  "documentation comment")
> >> +
> >> +        doc = QAPIDoc(self)
> >> +        self.accept(False)
> >> +        while self.tok == '#':
> >> +            if self.val.startswith('##'):
> >> +                # End of doc comment
> >> +                if self.val != '##':
> >> +                    raise QAPISchemaError(self, "Junk after '##' at end 
> >> of "
> >> +                                          "documentation comment")
> >> +                self.accept()
> >> +                return doc
> >> +            else:
> >> +                doc.append(self.val)
> >> +            self.accept(False)
> >> +
> >> +        raise QAPISchemaError(self, "Documentation comment must end with 
> >> '##'")
> >
> > Let's put this after accept, next to the other get_FOO().
> 
> ok
> 
> >> +
> >> +    def accept(self, skip_comment=True):
> >>          while True:
> >>              self.tok = self.src[self.cursor]
> >>              self.pos = self.cursor
> >> @@ -184,7 +321,13 @@ class QAPISchemaParser(object):
> >>              self.val = None
> >>  
> >>              if self.tok == '#':
> >> +                if self.src[self.cursor] == '#':
> >> +                    # Start of doc comment
> >> +                    skip_comment = False
> >>                  self.cursor = self.src.find('\n', self.cursor)
> >> +                if not skip_comment:
> >> +                    self.val = self.src[self.pos:self.cursor]
> >> +                    return
> >>              elif self.tok in "{}:,[]":
> >>                  return
> >>              elif self.tok == "'":
> >
> > Copied from review of v3, so I don't forget:
> >
> > Comment tokens are thrown away as before, except when the parser asks
> > for them by passing skip_comment=False, or when the comment token starts
> > with ##.  The parser asks while parsing a doc comment, in get_doc().
> >
> > This is a backchannel from the parser to the lexer.  I'd rather avoid
> > such lexer hacks, but I guess we can address that on top.
> 
> I don't have a good solution to that

I have an idea which may or may not work.  I can explore it on top.

> > A comment starting with ## inside an expression is now a syntax error.
> > For instance, input
> >
> >     {
> >     ##
> >
> > yields
> >
> >     /dev/stdin:2:1: Expected string or "}"
> >
> > Rather unfriendly error message, but we can fix that on top.
> >
> >> @@ -713,7 +856,7 @@ def check_keys(expr_elem, meta, required, optional=[]):
> >>                                  % (key, meta, name))
> >>  
> >>  
> >> -def check_exprs(exprs):
> >> +def check_exprs(exprs, strict_doc):
> >
> > Note: strict_doc=False unless this is qapi2texi.py.
> 
> For tests reasons: we may want to fix the test instead, or have a flag
> QAPI_CHECK/NOSTRICT_DOC=1 or an option.
> 
> >>      global all_names
> >>  
> >>      # Learn the types and check for valid expression keys
> >> @@ -722,6 +865,11 @@ def check_exprs(exprs):
> >>      for expr_elem in exprs:
> >>          expr = expr_elem['expr']
> >>          info = expr_elem['info']
> >> +
> >> +        if strict_doc and 'doc' not in expr_elem:
> >> +            raise QAPILineError(info,
> >> +                                "Expression missing documentation 
> >> comment")
> >> +
> >
> > Why should we supress this error in programs other than qapi2texi.py?
> 
> because the tests don't have comments

Good point :)

Can we come up with a brief comment explaining this?

> > Can't see a test for this one.
> 
> because tests aren't strict :)
> 
> I guess you'll tell me to fix the tests instead, so I did that in the next
> series.
> 
> >>          if 'enum' in expr:
> >>              check_keys(expr_elem, 'enum', ['data'], ['prefix'])
> >>              add_enum(expr['enum'], info, expr['data'])
> >> @@ -780,6 +928,63 @@ def check_exprs(exprs):
> >>      return exprs
> >>  
> >>  
> >> +def check_simple_doc(doc):
> >
> > You call this "free-form" elsewhere.  Pick one name and stick to it.
> > I think free-form is more descriptive than simple.
> 
> ok
> 
> >> +    if doc.symbol:
> >> +        raise QAPILineError(doc.info,
> >> +                            "'%s' documention is not followed by the 
> >> definition"
> >> +                            % doc.symbol)
> >
> > "Documentation for %s is not ..."
> 
> ok
> 
> >> +
> >> +    body = str(doc.body)
> >> +    if re.search(r'@\S+:', body, re.MULTILINE):
> >> +        raise QAPILineError(doc.info,
> >> +                            "Document body cannot contain @NAME: 
> >> sections")
> >> +
> >> +
> >> +def check_expr_doc(doc, expr, info):
> >
> > You call this "symbol" elsewhere.  I think "definition" would be better
> > than either.
> 
> ok
> 
> >> +    for i in ('enum', 'union', 'alternate', 'struct', 'command', 'event'):
> >> +        if i in expr:
> >> +            meta = i
> >> +            break
> >> +
> >> +    name = expr[meta]
> >> +    if doc.symbol != name:
> >> +        raise QAPILineError(info, "Definition of '%s' follows 
> >> documentation"
> >> +                            " for '%s'" % (name, doc.symbol))
> >> +    if doc.has_meta('Returns') and 'command' not in expr:
> >> +        raise QAPILineError(info, "Invalid return documentation")
> >
> > Suggest something like "'Returns:' is only valid for commands".
> >
> > We accept 'Returns:' even when the command doesn't return anything,
> > because we currently use it to document errors, too.  Can't say I like
> > that, but it's out of scope here.
> 
> ok
> 
> >> +
> >> +    doc_args = set(doc.args.keys())
> >> +    if meta == 'union':
> >> +        data = expr.get('base', [])
> >> +    else:
> >> +        data = expr.get('data', [])
> >> +    if isinstance(data, dict):
> >> +        data = data.keys()
> >> +    args = set([name.strip('*') for name in data])
> >
> > Not fixed since v3:
> >
> > * Flat union where 'base' is a string, e.g. union UserDefFlatUnion in
> >   qapi-schema-test.json has base 'UserDefUnionBase', args is set(['a',
> >   'B', 'e', 'D', 'f', 'i', 'o', 'n', 's', 'r', 'U'])
> >
> > * Command where 'data' is a string, e.g. user_def_cmd0 in
> >   qapi-schema-test.json has data 'Empty2', args is set(['E', 'm', 'p',
> >   '2', 't', 'y'])
> >
> > * Event where 'data' is a string, no test case handy (hole in test
> >   coverage)
> 
> ok, I changed it that way, that fixes it:
> +    if isinstance(data, list):
> +        args = set([name.strip('*') for name in data])
> +    else:
> +        args = set()

Uh, sure this does the right thing when data is a dict?

> >> +    if meta == 'alternate' or \
> >> +       (meta == 'union' and not expr.get('discriminator')):
> >> +        args.add('type')
> >
> > As explained in review of v3, this is only a subset of the real set of
> > members.  Computing the exact set is impractical when working with the
> > abstract syntax tree.  I believe we'll eventually have to rewrite this
> > code to work with the QAPISchemaEntity instead.
> 
> I don't think we want to list all the members as this would lead to
> duplicated documentation. Instead it should document only the members of
> the expr being defined.

You're sticking to established practice, which makes lots of sense.
However, established practice is a lot more clear for simple cases than
for things like members inherited from base types, variant members and
such.  At some point, we'll have to figure out how we want not so simple
cases documented.  Until then, we can't really decide how to check
documentation completeness.

>                         In which case, it looks like this check is good
> enough, no?

For now, yes.  Later on, maybe.

Let's document the limitation in a comment and the commit message, and
move on.

> >> +    if not doc_args.issubset(args):
> >> +        raise QAPILineError(info, "Members documentation is not a subset 
> >> of"
> >> +                            " API %r > %r" % (list(doc_args), list(args)))
> >> +
> >> +
> >> +def check_docs(docs):
> >> +    for doc in docs:
> >> +        for section in doc.args.values() + doc.meta:
> >> +            content = str(section)
> >> +            if not content or content.isspace():
> >> +                raise QAPILineError(doc.info,
> >> +                                    "Empty doc section '%s'" % 
> >> section.name)
> >> +
> >> +        if not doc.expr:
> >> +            check_simple_doc(doc)
> >> +        else:
> >> +            check_expr_doc(doc, doc.expr, doc.info)
> >> +
> >> +    return docs
> >> +
> >> +
> >>  #
> >>  # Schema compiler frontend
> >>  #
> >> @@ -1249,9 +1454,11 @@ class QAPISchemaEvent(QAPISchemaEntity):
> >>  
> >>  
> >>  class QAPISchema(object):
> >> -    def __init__(self, fname):
> >> +    def __init__(self, fname, strict_doc=False):
> >>          try:
> >> -            self.exprs = check_exprs(QAPISchemaParser(open(fname, 
> >> "r")).exprs)
> >> +            parser = QAPISchemaParser(open(fname, "r"))
> >> +            self.exprs = check_exprs(parser.exprs, strict_doc)
> >> +            self.docs = check_docs(parser.docs)
> >>              self._entity_dict = {}
> >>              self._predefining = True
> >>              self._def_predefineds()
> >> diff --git a/scripts/qapi2texi.py b/scripts/qapi2texi.py
> >> new file mode 100755
> >> index 0000000..0cec43a
> >> --- /dev/null
> >> +++ b/scripts/qapi2texi.py
> >
> > Still only skimming this one.
> >
> >> @@ -0,0 +1,331 @@
> >> +#!/usr/bin/env python
> >> +# QAPI texi generator
> >> +#
> >> +# This work is licensed under the terms of the GNU LGPL, version 2+.
> >> +# See the COPYING file in the top-level directory.
> >> +"""This script produces the documentation of a qapi schema in texinfo 
> >> format"""
> >> +import re
> >> +import sys
> >> +
> >> +import qapi
> >> +
> >> +COMMAND_FMT = """
> >> address@hidden {type} {{{ret}}} {name} @
> >> +{{{args}}}
> >> +
> >> +{body}
> >> +
> >> address@hidden deftypefn
> >> +
> >> +""".format
> >> +
> >> +ENUM_FMT = """
> >> address@hidden Enum {name}
> >> +
> >> +{body}
> >> +
> >> address@hidden deftp
> >> +
> >> +""".format
> >> +
> >> +STRUCT_FMT = """
> >> address@hidden {{{type}}} {name} @
> >> +{{{attrs}}}
> >> +
> >> +{body}
> >> +
> >> address@hidden deftp
> >> +
> >> +""".format
> >> +
> >> +EXAMPLE_FMT = """@example
> >> +{code}
> >> address@hidden example
> >> +""".format
> >> +
> >> +
> >> +def subst_strong(doc):
> >> +    """Replaces *foo* by @strong{foo}"""
> >> +    return re.sub(r'\*([^_\n]+)\*', r'@emph{\1}', doc)
> >> +
> >> +
> >> +def subst_emph(doc):
> >> +    """Replaces _foo_ by @emph{foo}"""
> >> +    return re.sub(r'\s_([^_\n]+)_\s', r' @emph{\1} ', doc)
> >> +
> >> +
> >> +def subst_vars(doc):
> >> +    """Replaces @var by @code{var}"""
> >> +    return re.sub(r'@([\w-]+)', r'@code{\1}', doc)
> >> +
> >> +
> >> +def subst_braces(doc):
> >> +    """Replaces {} with @{ @}"""
> >> +    return doc.replace("{", "@{").replace("}", "@}")
> >> +
> >> +
> >> +def texi_example(doc):
> >> +    """Format @example"""
> >> +    doc = subst_braces(doc).strip('\n')
> >> +    return EXAMPLE_FMT(code=doc)
> >> +
> >> +
> >> +def texi_comment(doc):
> >> +    """
> >> +    Format a comment
> >> +
> >> +    Lines starting with:
> >> +    - |: generates an @example
> >> +    - =: generates @section
> >> +    - ==: generates @subsection
> >> +    - 1. or 1): generates an @enumerate @item
> >> +    - o/*/-: generates an @itemize list
> >> +    """
> >> +    lines = []
> >> +    doc = subst_braces(doc)
> >> +    doc = subst_vars(doc)
> >> +    doc = subst_emph(doc)
> >> +    doc = subst_strong(doc)
> >> +    inlist = ""
> >> +    lastempty = False
> >> +    for line in doc.split('\n'):
> >> +        empty = line == ""
> >> +
> >> +        if line.startswith("| "):
> >> +            line = EXAMPLE_FMT(code=line[2:])
> >> +        elif line.startswith("= "):
> >> +            line = "@section " + line[2:]
> >> +        elif line.startswith("== "):
> >> +            line = "@subsection " + line[3:]
> >> +        elif re.match("^([0-9]*[.)]) ", line):
> >> +            if not inlist:
> >> +                lines.append("@enumerate")
> >> +                inlist = "enumerate"
> >> +            line = line[line.find(" ")+1:]
> >> +            lines.append("@item")
> >> +        elif re.match("^[o*-] ", line):
> >> +            if not inlist:
> >> +                lines.append("@itemize %s" % {'o': "@bullet",
> >> +                                              '*': "@minus",
> >> +                                              '-': ""}[line[0]])
> >> +                inlist = "itemize"
> >> +            lines.append("@item")
> >> +            line = line[2:]
> >> +        elif lastempty and inlist:
> >> +            lines.append("@end %s\n" % inlist)
> >> +            inlist = ""
> >> +
> >> +        lastempty = empty
> >> +        lines.append(line)
> >> +
> >> +    if inlist:
> >> +        lines.append("@end %s\n" % inlist)
> >> +    return "\n".join(lines)
> >> +
> >> +
> >> +def texi_args(expr, key="data"):
> >> +    """
> >> +    Format the functions/structure/events.. arguments/members
> >> +    """
> >> +    if key not in expr:
> >> +        return ""
> >> +
> >> +    args = expr[key]
> >> +    arg_list = []
> >> +    if isinstance(args, str):
> >> +        arg_list.append(args)
> >> +    else:
> >> +        for name, typ in args.iteritems():
> >> +            # optional arg
> >> +            if name.startswith("*"):
> >> +                name = name[1:]
> >> +                arg_list.append("['%s': @var{%s}]" % (name, typ))
> >> +            # regular arg
> >> +            else:
> >> +                arg_list.append("'%s': @var{%s}" % (name, typ))
> >
> > Inappropriate use of @var.  @var is for metasyntactic variables,
> > i.e. something that stands for another piece of text.  typ isn't, it's
> > the name of a specific QAPI type.  I think you should use @code.
> 
> yes
> 
> > This is the reason why the type names in qemu-qmp-ref.txt are often
> > mangled, e.g.
> >
> >  -- Struct: VersionInfo { 'qemu': VERSIONTRIPLE, 'package': STR }
> 
> Right, we have format issue here. If we change it for @code, we get
> additional quotes in the text format. The simpler is to use no format or
> @t{}

Pick something you like.

> >> +
> >> +    return ", ".join(arg_list)
> >> +
> >> +
> >> +def texi_body(doc):
> >> +    """
> >> +    Format the body of a symbol documentation:
> >> +    - a table of arguments
> >> +    - followed by "Returns/Notes/Since/Example" sections
> >> +    """
> >> +    def _section_order(section):
> >> +        return {"Returns": 0,
> >> +                "Note": 1,
> >> +                "Notes": 1,
> >> +                "Since": 2,
> >> +                "Example": 3,
> >> +                "Examples": 3}[section]
> >> +
> >> +    body = "@table @asis\n"
> >> +    for arg, section in doc.args.iteritems():
> >> +        desc = str(section)
> >> +        opt = ''
> >> +        if desc.startswith("#optional"):
> >> +            desc = desc[10:]
> >> +            opt = ' *'
> >> +        elif desc.endswith("#optional"):
> >> +            desc = desc[:-10]
> >> +            opt = ' *'
> >> +        body += "@item @code{'%s'}%s\n%s\n" % (arg, opt, 
> >> texi_comment(desc))
> >> +    body += "@end table\n"
> >> +    body += texi_comment(str(doc.body))
> >> +
> >> +    meta = sorted(doc.meta, key=lambda i: _section_order(i.name))
> >> +    for section in meta:
> >> +        name, doc = (section.name, str(section))
> >> +        func = texi_comment
> >> +        if name.startswith("Example"):
> >> +            func = texi_example
> >> +
> >> +        body += "address@hidden address@hidden quotation" % \
> >> +                (name, func(doc))
> >> +    return body
> >> +
> >> +
> >> +def texi_alternate(expr, doc):
> >> +    """
> >> +    Format an alternate to texi
> >> +    """
> >> +    args = texi_args(expr)
> >> +    body = texi_body(doc)
> >> +    return STRUCT_FMT(type="Alternate",
> >> +                      name=doc.symbol,
> >> +                      attrs="[ " + args + " ]",
> >> +                      body=body)
> >> +
> >> +
> >> +def texi_union(expr, doc):
> >> +    """
> >> +    Format an union to texi
> >
> > I think it's "a union".
> 
> ok
> 
> >> +    """
> >> +    attrs = "@{ " + texi_args(expr, "base") + " @}"
> >> +    discriminator = expr.get("discriminator")
> >> +    if not discriminator:
> >> +        union = "Flat Union"
> >> +        discriminator = "type"
> >> +    else:
> >> +        union = "Union"
> >
> > Condition is backwards.
> 
> fixed
> 
> >> +    attrs += " + '%s' = [ " % discriminator
> >> +    attrs += texi_args(expr, "data") + " ]"
> >> +    body = texi_body(doc)
> >> +
> >> +    return STRUCT_FMT(type=union,
> >> +                      name=doc.symbol,
> >> +                      attrs=attrs,
> >> +                      body=body)
> >
> > You're inventing syntax here.  Example output:
> >
> >  -- Union: QCryptoBlockOpenOptions { QCryptoBlockOptionsBase } +
> >           'format' = [ 'qcow': QCRYPTOBLOCKOPTIONSQCOW, 'luks':
> >           QCRYPTOBLOCKOPTIONSLUKS ]
> >
> >      The options that are available for all encryption formats when
> >      opening an existing volume
> >           Since: 2.6
> >
> >  -- Flat Union: ImageInfoSpecific { } + 'type' = [ 'qcow2':
> >           IMAGEINFOSPECIFICQCOW2, 'vmdk': IMAGEINFOSPECIFICVMDK, 'luks':
> >           QCRYPTOBLOCKINFOLUKS ]
> >
> >      A discriminated record of image format specific information
> >      structures.
> >           Since: 1.7
> >
> > Note that QCryptoBlockOpenOptions is actually a flat union, and
> > ImageInfoSpecific a simple union.  As I said, the condition is
> > backwards.
> >
> > The meaning of the attrs part is unobvious.  Familiarity with schema
> > syntax doesn't really help.
> >
> > Could we simply use schema syntax here?
> 
> Union: QCryptoBlockOpenOptions {
> 'base': QCryptoBlockOptionsBase,
> 'discriminator': 'format',
> 'data':  { 'qcow': QCryptoBlockOptionsQCow, 'luks':
> QCryptoBlockCreateOptionsLUKS }
> 
> }
> 
> Doesn't look obvious either and pollute the documentation with
> schema-specific parameters.

The schema syntax for flat unions is pretty horrid :)  I hope to improve
it, but there's more urgent fish to fry.

> > If not: whatever format you use, you first need to explain it.
> 
> I am not sure my solution is the best and will remain, but ok let's try to
> document it for now.

Before you pour time into documenting what you have, we should probably
discuss what we need.

> >> +
> >> +
> >> +def texi_enum(expr, doc):
> >> +    """
> >> +    Format an enum to texi
> >> +    """
> >> +    for i in expr['data']:
> >> +        if i not in doc.args:
> >> +            doc.args[i] = ''
> >> +    body = texi_body(doc)
> >> +    return ENUM_FMT(name=doc.symbol,
> >> +                    body=body)
> >> +
> >> +
> >> +def texi_struct(expr, doc):
> >> +    """
> >> +    Format a struct to texi
> >> +    """
> >> +    args = texi_args(expr)
> >> +    body = texi_body(doc)
> >> +    attrs = "@{ " + args + " @}"
> >> +    base = expr.get("base")
> >> +    if base:
> >> +        attrs += " + %s" % base
> >> +    return STRUCT_FMT(type="Struct",
> >> +                      name=doc.symbol,
> >> +                      attrs=attrs,
> >> +                      body=body)
> >
> > More syntax invention.  Example output:
> >
> >  -- Struct: BlockdevOptionsReplication { 'mode': REPLICATIONMODE,
> >           ['top-id': STR] } + BlockdevOptionsGenericFormat
> >
> >      ''mode''
> >           the replication mode
> >      ''top-id'' *
> >           In secondary mode, node name or device ID of the root node who
> >           owns the replication node chain.  Must not be given in primary
> >           mode.
> >      Driver specific block device options for replication
> >           Since: 2.8
> >
> > Meaning of the attrs part is perhaps more guessable here, but it's still
> > guesswork.
> >
> > The meaning of * after ''top-id'' is also unobvious.
> >
> > Note the redundancy between the attrs part and the body: both state
> > member names and optionalness.  The body doesn't state member types and
> > base type.  If we fixed that, we could drop the attrs part and save us
> > the trouble of defining and explaining a syntax for it.
> >
> > Let me take a step back.  This document is about the QMP wire format.
> > There are no such things as simple and flat unions on the wire, only
> > JSON objects.  QMP introspection duly describes a type's JSON objects,
> > not how it's defined in QAPI.  I think QMP documentation should ideally
> > do the same.
> >
> > QMP introspection uses a common structure for struct, simple and flat
> > union: common members, variants, and if variants, then the common member
> > that serves as tag.  See introspect.json for details.
> >
> > Base types are flattened away.  Appropriate for introspection, but
> > documentation shouldn't do that.
> >
> > I wrote "ideally" because it's probably too big a step.  I'm willing to
> > settle for something less than ideal.
> 
> I don't have clear idea what to do here, so if we can leave that for later,
> that would be nice for me (I am already spending more time than I imagined
> I would on doc stuff)

Me too, if that's any consolation...

Perhaps I can find a bit of time to think so I can propose what we could
do.

> >> +
> >> +
> >> +def texi_command(expr, doc):
> >> +    """
> >> +    Format a command to texi
> >> +    """
> >> +    args = texi_args(expr)
> >> +    ret = expr["returns"] if "returns" in expr else ""
> >> +    body = texi_body(doc)
> >> +    return COMMAND_FMT(type="Command",
> >> +                       name=doc.symbol,
> >> +                       ret=ret,
> >> +                       args="(" + args + ")",
> >> +                       body=body)
> >> +
> >> +
> >> +def texi_event(expr, doc):
> >> +    """
> >> +    Format an event to texi
> >> +    """
> >> +    args = texi_args(expr)
> >> +    body = texi_body(doc)
> >> +    return COMMAND_FMT(type="Event",
> >> +                       name=doc.symbol,
> >> +                       ret="",
> >> +                       args="(" + args + ")",
> >> +                       body=body)
> >> +
> >> +
> >> +def texi_expr(expr, doc):
> >> +    """
> >> +    Format an expr to texi
> >> +    """
> >> +    (kind, _) = expr.items()[0]
> >> +
> >> +    fmt = {"command": texi_command,
> >> +           "struct": texi_struct,
> >> +           "enum": texi_enum,
> >> +           "union": texi_union,
> >> +           "alternate": texi_alternate,
> >> +           "event": texi_event}
> >> +    try:
> >> +        fmt = fmt[kind]
> >> +    except KeyError:
> >> +        raise ValueError("Unknown expression kind '%s'" % kind)
> >
> > The try / except converts one kind of error into another.  What does
> > that buy us?  As far as I can tell, this shouldn't ever happen.
> 
> dropped
> 
> >> +
> >> +    return fmt(expr, doc)
> >> +
> >> +
> >> +def texi(docs):
> >> +    """
> >> +    Convert QAPI schema expressions to texi documentation
> >> +    """
> >> +    res = []
> >> +    for doc in docs:
> >> +        expr = doc.expr
> >> +        if not expr:
> >> +            res.append(texi_body(doc))
> >> +            continue
> >> +        try:
> >> +            doc = texi_expr(expr, doc)
> >> +            res.append(doc)
> >> +        except:
> >> +            print >>sys.stderr, "error at @%s" % doc.info
> >> +            raise
> >> +
> >> +    return '\n'.join(res)
> >> +
> >> +
> >> +def main(argv):
> >> +    """
> >> +    Takes schema argument, prints result to stdout
> >> +    """
> >> +    if len(argv) != 2:
> >> +        print >>sys.stderr, "%s: need exactly 1 argument: SCHEMA" % 
> >> argv[0]
> >> +        sys.exit(1)
> >> +
> >> +    schema = qapi.QAPISchema(argv[1], strict_doc=True)
> >> +    print texi(schema.docs)
> >> +
> >> +
> >> +if __name__ == "__main__":
> >> +    main(sys.argv)
> >> diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt
> >> index 2841c51..8bc963e 100644
> >> --- a/docs/qapi-code-gen.txt
> >> +++ b/docs/qapi-code-gen.txt
> >> @@ -45,16 +45,13 @@ QAPI parser does not).  At present, there is no place 
> >> where a QAPI
> >>  schema requires the use of JSON numbers or null.
> >>  
> >>  Comments are allowed; anything between an unquoted # and the following
> >> -newline is ignored.  Although there is not yet a documentation
> >> -generator, a form of stylized comments has developed for consistently
> >> -documenting details about an expression and when it was added to the
> >> -schema.  The documentation is delimited between two lines of ##, then
> >> -the first line names the expression, an optional overview is provided,
> >> -then individual documentation about each member of 'data' is provided,
> >> -and finally, a 'Since: x.y.z' tag lists the release that introduced
> >> -the expression.  Optional members are tagged with the phrase
> >> -'#optional', often with their default value; and extensions added
> >> -after the expression was first released are also given a '(since
> >> +newline is ignored.  The documentation is delimited between two lines
> >> +of ##, then the first line names the expression, an optional overview
> >> +is provided, then individual documentation about each member of 'data'
> >> +is provided, and finally, a 'Since: x.y.z' tag lists the release that
> >> +introduced the expression.  Optional members are tagged with the
> >> +phrase '#optional', often with their default value; and extensions
> >> +added after the expression was first released are also given a '(since
> >>  x.y.z)' comment.  For example:
> >>  
> >>      ##
> >> @@ -73,12 +70,49 @@ x.y.z)' comment.  For example:
> >>      #           (Since 2.0)
> >>      #
> >>      # Since: 0.14.0
> >> +    #
> >> +    # Notes: You can also make a list:
> >> +    #        - with items
> >> +    #        - like this
> >> +    #
> >> +    # Example:
> >> +    #
> >> +    # -> { "execute": ... }
> >> +    # <- { "return": ... }
> >> +    #
> >>      ##
> >>      { 'struct': 'BlockStats',
> >>        'data': {'*device': 'str', 'stats': 'BlockDeviceStats',
> >>                 '*parent': 'BlockStats',
> >>                 '*backing': 'BlockStats'} }
> >
> > This example gives an idea of how to document struct types.  But the
> > reader is left guessing how to document other kinds of definitions.  In
> > particular, there's no mention of "Returns", "Note" and "Examples"
> > sections anywhere in this file.  As is, this could do for a tutorial,
> > but this file is a *reference*, not a tutorial.
> >
> > For a reference, we need to be more thorough.  A doc comments section on
> > its own seems advisable.
> >
> > I guess doc comment examples are best added to the schema code examples
> > in sections === Struct types ===, ..., === Events ===.
> >
> > Careful review for completeness is advised.
> 
> So far we did quite fine with the generic example. I am not convinced we
> need to have doc comments for all kinds of types, it looks redundant to me.

Well, so far we didn't require people to write well-formed doc comments.

Without such comments in qapi-code-gen.txt's examples, they become
incomplete.  Pity, because I like to feed them to the QAPI generators;
extracted like this:

    $ sed '1,/\$ cat example-schema.json/d;/^[^ ]/,$d;s/^    //' 
../docs/qapi-code-gen.txt >example-schema.json

> >> +It's also possible to create documentation sections, such as:
> >> +
> >> +    ##
> >> +    # = Section
> >> +    # == Subsection
> >> +    #
> >> +    # Some text foo with *strong* and _emphasis_
> >> +    # 1. with a list
> >> +    # 2. like that
> >> +    #
> >> +    # And some code:
> >> +    # | $ echo foo
> >> +    # | -> do this
> >> +    # | <- get that
> >> +    #
> >> +    ##
> >> +
> >> +Text *foo* and _foo_ are for "strong" and "emphasis" styles (they do
> >> +not work over multiple lines). @foo is used to reference a symbol.
> >> +
> >> +Lines starting with the following characters and a space:
> >> +- | are examples
> >> +- = are top section
> >> +- == are subsection
> >> +- X. or X) are enumerations (X is any number)
> >> +- o/*/- are itemized list
> >> +
> >
> > Is this doc markup documentation complete?
> 
> I think so
> 
> > Is it related to any existing text markup language?  Hmm, the commit
> > message calls it "Markdown-like".  Should we mention that here?
> 
> I'll use "annotations" instead in the doc
> 
> > Is all markup valid in all contexts?  For instance, is = Section valid
> > in symbol blocks?  What does | (or any markup for that matter) do within
> > an Example section?
> 
> Example is verbatim
> 
> I improved a bit the doc based on your feedback. I'd suggest we iterate
> with additional patches later, I could use help.

Let's mark places that need improvement with FIXME or TODO comments,
noted in commit messages.

> >>  The schema sets up a series of types, as well as commands and events
> >>  that will use those types.  Forward references are allowed: the parser
> >>  scans in two passes, where the first pass learns all type names, and
> >> diff --git a/tests/qapi-schema/doc-bad-args.err 
> >> b/tests/qapi-schema/doc-bad-args.err
> >> new file mode 100644
> >> index 0000000..a55e003
> >> --- /dev/null
> >> +++ b/tests/qapi-schema/doc-bad-args.err
> >> @@ -0,0 +1 @@
> >> +tests/qapi-schema/doc-bad-args.json:1: Members documentation is not a 
> >> subset of API ['a', 'b'] > ['a']
> >> diff --git a/tests/qapi-schema/doc-bad-args.exit 
> >> b/tests/qapi-schema/doc-bad-args.exit
> >> new file mode 100644
> >> index 0000000..d00491f
> >> --- /dev/null
> >> +++ b/tests/qapi-schema/doc-bad-args.exit
> >> @@ -0,0 +1 @@
> >> +1
> >> diff --git a/tests/qapi-schema/doc-bad-args.json 
> >> b/tests/qapi-schema/doc-bad-args.json
> >> new file mode 100644
> >> index 0000000..26992ea
> >> --- /dev/null
> >> +++ b/tests/qapi-schema/doc-bad-args.json
> >> @@ -0,0 +1,6 @@
> >> +##
> >> +# @foo:
> >> +# @a: a
> >> +# @b: b
> >> +##
> >> +{ 'command': 'foo', 'data': {'a': 'int'} }
> >
> > The existing tests/qapi-schema/*.json explain what is being tested in a
> > comment.  Perhaps like this:
> >
> >     # Arguments listed in the doc comment must exist in the actual schema
> >
> >     ##
> >     # @foo:
> >     ...
> >
> > Likewise for other tests.
> 
> ok
> 
[...]
> > [1] https://www.python.org/dev/peps/pep-0257/
> > [2] https://google.github.io/styleguide/pyguide.html
> > [3] https://www.python.org/dev/peps/pep-0008/



reply via email to

[Prev in Thread] Current Thread [Next in Thread]