qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 3/6] qapi/parser: add type hint annotations (QAPIDoc)


From: John Snow
Subject: Re: [PATCH v2 3/6] qapi/parser: add type hint annotations (QAPIDoc)
Date: Tue, 28 Sep 2021 19:25:29 -0400



On Tue, Sep 7, 2021 at 6:44 AM Markus Armbruster <armbru@redhat.com> wrote:
John Snow <jsnow@redhat.com> writes:

> Annotations do not change runtime behavior.
>
> This commit adds mostly annotations, but uses a TYPE_CHECKING runtime
> check to conditionally import dependencies, which only triggers during
> runs of mypy.

Please add a reference to
https://mypy.readthedocs.io/en/latest/runtime_troubles.html#import-cycles


OK.
 
> Signed-off-by: John Snow <jsnow@redhat.com>
>
> ---
>
> TopLevelExpr, an idea from previous drafts, makes a return here in order
> to give a semantic meaning to check_expr(). The type is intended to be
> used more in forthcoming commits (pt5c), but I opted to include it now
> instead of creating yet-another Dict[str, object] type hint that I would
> forget to change later.
>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  scripts/qapi/parser.py | 77 ++++++++++++++++++++++++++----------------
>  1 file changed, 48 insertions(+), 29 deletions(-)
>
> diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
> index 3ddde318376..b1e2fa5c577 100644
> --- a/scripts/qapi/parser.py
> +++ b/scripts/qapi/parser.py
> @@ -18,6 +18,7 @@
>  import os
>  import re
>  from typing import (
> +    TYPE_CHECKING,
>      Dict,
>      List,
>      Optional,
> @@ -30,6 +31,15 @@
>  from .source import QAPISourceInfo


> +if TYPE_CHECKING:
> +    # pylint: disable=cyclic-import
> +    # TODO: Remove cycle. [schema -> expr -> parser -> schema]

WRT this todo: you mentioned that you'd prefer having some idea or plan for how to eliminate the cycle in order to let this band-aid fly. How about adding a 'doc' member to e.g. QAPISchemaFeature and doing the connection entirely inside of schema.py, and dropping connect_member() and connect_feature()?

Would that be serviceable?
 
> +    from .schema import QAPISchemaFeature, QAPISchemaMember
> +
> +
> +#: Represents a single Top Level QAPI schema _expression_.
> +TopLevelExpr = Dict[str, object]

Related: _ExprValue below, and _JSONObject in expr.py.  The latter's
comment gives the best rationale (except I don't get "the purpose of
this module is to interrogate that type").


in expr.py, the purpose of that module (expr) is explicitly to interrogate (check, validate) the shape of arbitrary JSON objects. I am saying that a more strict definition specifically here in expr.py is not necessary because the entire purpose of expr.py is to, at runtime, verify the shape of any such variables that might be annotated that way. I am drawing some distinction to introspect.py, where we're generating that data ourselves -- the stronger types are more viable there, because we know what they are already.
(again, sorry about mypy's lack of recursive typing, I hate it too, I promise)
 
I think we'd like to have

* A recursive type for JSON value (in our bastardized version of JSON)

  This is Union of bool, str, List[Self], Dict[str, Self].  It's what
  .get_expr() returns.

  Since mypy can't do recursive, we approximate with _ExprValue.

* A recursive type for JSON object

  This is the Dict branch of the above.  It's what check_keys() &
  friends take as argument.

  We approximate with _JSONObject.

  Same for the List branch would make sense if we had a use for the
  type.

* A recursive type for TOP-LEVEL-EXPR

  Actually the same type as the previous one, to be used only for the
  schema's top-level expressions.  It's the elements of
  QAPISchemaParser.exprs[], and what check_exprs() takes as argument.

  We approximate with TopLevelExpr, but so far use it only for
  check_exprs().

  Doesn't really improve type checking, but may serve as documentation.


That's the intended effect here -- to help highlight which functions operate on those top-level expressions, and which might be invoked in more arbitrary cases.
Consider also a hypothetical TOP-LEVEL-EXPR that is actually a bona-fide object with additional metadata, too. In these cases, the type really will be different.
 
Shouldn't these types be all defined in one place, namely right here?
Bonus: we need to explain the mypy sadness just once then.

Shouldn't their names be more systematic?  _ExprValue, _JSONObject and
TopLevelExpr hardly suggest any relation...


I drop _JSONObject in pt5c, it was a stop-gap. For the rest, I'll see about a more rigorous consolidation now that we're this far in.

pt5c was intended as a "cleanup" step that did some of that consolidation of nearly-redundant types; but I wanted all of parser.py under the mypy gun *first*.
Will you take a raincheck here and we'll focus on the consolidation in the next series? I agree it's worth doing.

(I can add a 'FIXME' that will 100% need to be fixed before I move scripts/qapi under python/qemu/qapi -- the linter config there prohibits them, so you can be sure I can't ignore it.)
 
> +
>  # Return value alias for get_expr().
>  _ExprValue = Union[List[object], Dict[str, object], str, bool]

> @@ -447,7 +457,8 @@ class QAPIDoc:
>      """

>      class Section:
> -        def __init__(self, parser, name=None, indent=0):
> +        def __init__(self, parser: QAPISchemaParser,
> +                     name: Optional[str] = None, indent: int = 0):
>              # parser, for error messages about indentation
>              self._parser = parser
>              # optional section name (argument/member or section name)
> @@ -459,7 +470,7 @@ def __init__(self, parser, name=None, indent=0):
>          def __bool__(self) -> bool:
>              return bool(self.name or self.text)

> -        def append(self, line):
> +        def append(self, line: str) -> None:
>              # Strip leading spaces corresponding to the expected indent level
>              # Blank lines are always OK.
>              if line:
> @@ -474,39 +485,40 @@ def append(self, line):
>              self.text += line.rstrip() + '\n'

>      class ArgSection(Section):
> -        def __init__(self, parser, name, indent=0):
> +        def __init__(self, parser: QAPISchemaParser,
> +                     name: Optional[str] = None, indent: int = 0):

Why not name: str?  All callers pass a str argument...


Maybe a holdover from when I was trying to stick to LSP for initializers. I think I have since learned that mypy will only whine about LSP if you hold pointers to the initializer to be invoked in factory functions. I'll fix it and we'll worry about the LSP warnings if they ever decide to show up.
 
>              super().__init__(parser, name, indent)
> -            self.member = None
> +            self.member: Optional['QAPISchemaMember'] = None

I guess you need to quote 'QAPISchemaMember', because we actually import
it only if TYPE_CHECKING.  More of the same below.


Yep. Future Python versions may be better about this (You don't have to quote any types, ever -- they're parsed in a later pass) but we just don't have that luxury yet.

See https://www.python.org/dev/peps/pep-0563/
Also see https://www.reddit.com/r/Python/comments/muyz5h/pep_563_getting_rolled_back_from_python_310/ for some discussion on why this feature was not made the default for 3.10

(TLDR: It breaks a library called pydantic that largely does exactly what expr.py does, using type annotations to define the valid shapes of objects. I'd love to use it in QAPI, actually, ... a discussion for yet another day.)
 

> -        def connect(self, member):
> +        def connect(self, member: 'QAPISchemaMember') -> None:
>              self.member = member

> -    def __init__(self, parser, info):
> +    def __init__(self, parser: QAPISchemaParser, info: QAPISourceInfo):
>          # self._parser is used to report errors with QAPIParseError.  The
>          # resulting error position depends on the state of the parser.
>          # It happens to be the beginning of the comment.  More or less
>          # servicable, but action at a distance.
>          self._parser = parser
>          self.info = info
> -        self.symbol = None
> +        self.symbol: Optional[str] = None
>          self.body = QAPIDoc.Section(parser)
>          # dict mapping parameter name to ArgSection
> -        self.args = OrderedDict()
> -        self.features = OrderedDict()
> +        self.args: Dict[str, QAPIDoc.ArgSection] = OrderedDict()
> +        self.features: Dict[str, QAPIDoc.ArgSection] = OrderedDict()
>          # a list of Section
> -        self.sections = []
> +        self.sections: List[QAPIDoc.Section] = []
>          # the current section
>          self._section = self.body
>          self._append_line = self._append_body_line

> -    def has_section(self, name):
> +    def has_section(self, name: str) -> bool:
>          """Return True if we have a section with this name."""
>          for i in self.sections:
>              if i.name == name:
>                  return True
>          return False

> -    def append(self, line):
> +    def append(self, line: str) -> None:
>          """
>          Parse a comment line and add it to the documentation.

> @@ -527,18 +539,18 @@ def append(self, line):
>          line = line[1:]
>          self._append_line(line)

> -    def end_comment(self):
> +    def end_comment(self) -> None:
>          self._end_section()

>      @staticmethod
> -    def _is_section_tag(name):
> +    def _is_section_tag(name: str) -> bool:
>          return name in ('Returns:', 'Since:',
>                          # those are often singular or plural
>                          'Note:', 'Notes:',
>                          'Example:', 'Examples:',
>                          'TODO:')

> -    def _append_body_line(self, line):
> +    def _append_body_line(self, line: str) -> None:
>          """
>          Process a line of documentation text in the body section.

> @@ -578,7 +590,7 @@ def _append_body_line(self, line):
>              # This is a free-form documentation block
>              self._append_freeform(line)

> -    def _append_args_line(self, line):
> +    def _append_args_line(self, line: str) -> None:
>          """
>          Process a line of documentation text in an argument section.

> @@ -624,7 +636,7 @@ def _append_args_line(self, line):

>          self._append_freeform(line)

> -    def _append_features_line(self, line):
> +    def _append_features_line(self, line: str) -> None:
>          name = line.split(' ', 1)[0]

>          if name.startswith('@') and name.endswith(':'):
> @@ -656,7 +668,7 @@ def _append_features_line(self, line):

>          self._append_freeform(line)

> -    def _append_various_line(self, line):
> +    def _append_various_line(self, line: str) -> None:
>          """
>          Process a line of documentation text in an additional section.

> @@ -692,7 +704,11 @@ def _append_various_line(self, line):

>          self._append_freeform(line)

> -    def _start_symbol_section(self, symbols_dict, name, indent):
> +    def _start_symbol_section(
> +            self,
> +            symbols_dict: Dict[str, 'QAPIDoc.ArgSection'],

The need to quote this within the very class that defines it is
annoying.

Yep. Python has weird scoping rules for nested classes. In my own hobby code I've largely started avoiding ever using them, because they don't seem to be worth the trouble.

Long story short: the name isn't available yet because at the time this annotation is evaluated, we're still in the middle of defining it. I wish there was a syntax for $thisclass for a nascent self-reference, but alas.

(Yes, I dislike it too. Nothing I can do about it, to my knowledge.)
 

> +            name: str,
> +            indent: int) -> None:
>          # FIXME invalid names other than the empty string aren't flagged
>          if not name:
>              raise QAPIParseError(self._parser, "invalid parameter name")
> @@ -704,13 +720,14 @@ def _start_symbol_section(self, symbols_dict, name, indent):
>          self._section = QAPIDoc.ArgSection(self._parser, name, indent)
>          symbols_dict[name] = self._section

> -    def _start_args_section(self, name, indent):
> +    def _start_args_section(self, name: str, indent: int) -> None:
>          self._start_symbol_section(self.args, name, indent)

> -    def _start_features_section(self, name, indent):
> +    def _start_features_section(self, name: str, indent: int) -> None:
>          self._start_symbol_section(self.features, name, indent)

> -    def _start_section(self, name=None, indent=0):
> +    def _start_section(self, name: Optional[str] = None,
> +                       indent: int = 0) -> None:
>          if name in ('Returns', 'Since') and self.has_section(name):
>              raise QAPIParseError(self._parser,
>                                   "duplicated '%s' section" % name)
> @@ -718,7 +735,7 @@ def _start_section(self, name=None, indent=0):
>          self._section = QAPIDoc.Section(self._parser, name, indent)
>          self.sections.append(self._section)

> -    def _end_section(self):
> +    def _end_section(self) -> None:
>          if self._section:
>              text = self._section.text = self._section.text.strip()
>              if self._section.name and (not text or text.isspace()):
> @@ -727,7 +744,7 @@ def _end_section(self):
>                      "empty doc section '%s'" % self._section.name)
>              self._section = QAPIDoc.Section(self._parser)

> -    def _append_freeform(self, line):
> +    def _append_freeform(self, line: str) -> None:
>          match = re.match(r'(@\S+:)', line)
>          if match:
>              raise QAPIParseError(self._parser,
> @@ -735,28 +752,30 @@ def _append_freeform(self, line):
>                                   % match.group(1))
>          self._section.append(line)

> -    def connect_member(self, member):
> +    def connect_member(self, member: 'QAPISchemaMember') -> None:
>          if member.name not in self.args:
>              # Undocumented TODO outlaw
>              self.args[member.name] = QAPIDoc.ArgSection(self._parser,
>                                                          member.name)
>          self.args[member.name].connect(member)

> -    def connect_feature(self, feature):
> +    def connect_feature(self, feature: 'QAPISchemaFeature') -> None:
>          if feature.name not in self.features:
>              raise QAPISemError(feature.info,
>                                 "feature '%s' lacks documentation"
>                                 % feature.name)
>          self.features[feature.name].connect(feature)

> -    def check_expr(self, expr):
> +    def check_expr(self, expr: TopLevelExpr) -> None:
>          if self.has_section('Returns') and 'command' not in expr:
>              raise QAPISemError(self.info,
>                                 "'Returns:' is only valid for commands")

> -    def check(self):
> +    def check(self) -> None:

> -        def check_args_section(args, what):
> +        def check_args_section(
> +                args: Dict[str, QAPIDoc.ArgSection], what: str
> +        ) -> None:

This is the fourth use of Dict[str, QAPIDoc.ArgSection].  Still fine,
but if we acquire even more, we should consider giving it a name.


If they share something in common that makes it name-able, sure. For now: eh?
 
>              bogus = [name for name, section in args.items()
>                       if not section.member]
>              if bogus:


reply via email to

[Prev in Thread] Current Thread [Next in Thread]