qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 03/16] qapi/expr.py: constrain incoming expression types


From: John Snow
Subject: Re: [PATCH v3 03/16] qapi/expr.py: constrain incoming expression types
Date: Thu, 25 Feb 2021 15:43:50 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.0

On 2/25/21 6:56 AM, Markus Armbruster wrote:
John Snow <jsnow@redhat.com> writes:

On 2/24/21 5:01 AM, Markus Armbruster wrote:
John Snow <jsnow@redhat.com> writes:

mypy does not know the types of values stored in Dicts that masquerade
as objects. Help the type checker out by constraining the type.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Cleber Rosa <crosa@redhat.com>
---
   scripts/qapi/expr.py | 25 ++++++++++++++++++++++---
   1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/scripts/qapi/expr.py b/scripts/qapi/expr.py
index 5694c501fa3..783282b53ce 100644
--- a/scripts/qapi/expr.py
+++ b/scripts/qapi/expr.py
@@ -15,9 +15,17 @@
   # See the COPYING file in the top-level directory.
import re
+from typing import MutableMapping, Optional
from .common import c_name
   from .error import QAPISemError
+from .parser import QAPIDoc
+from .source import QAPISourceInfo
+
+
+# Expressions in their raw form are JSON-like structures with arbitrary forms.
+# Minimally, their top-level form must be a mapping of strings to values.
+Expression = MutableMapping[str, object]

MutableMapping, fancy.  It's only ever dict.  Why abstract from that?

OrderedDict, actually.  MutableMapping is misleading, because it doesn't
specify "orderedness".


Yeah, I am realizing that Dict helps imply that constraint on 3.6+ but that MutableMapping doesn't.

I am worried about how hard it's gonna hurt when I remember why I wanted MutableMapping.

>:|

For now, I'll go back to Dict.

I don't know! I referenced this in the cover letter. I cannot remember
the reason anymore. It had R-Bs on it so I left it alone.

There are some differences, but I no longer remember why I thought they
applied. Maybe some of my more exploratory work wanted it. Dunno.

Happens.  It's a long patch queue you're trying to flush.

The use of object is again owed to mypy's inability to do recursive
types.  What we really have here is something like

     Expression = Union[bool, str, dict[str, Expression], list[Expression]]

with the root further constrained to the Union's dict branch.  Spell
that out in a bit more detail, like you did in introspect.py?


Terminology problem?

I am using "Expression" to mean very specifically a top-level object as
returned from parser.py, which *must* be an Object, so it *must* be a
mapping of str => yaddayadda.

Aha!

We'll talk some more about naming of type aliases in review of PATCH 08.

The type as I intended it is Expression = Dict[str, yaddayadda]

where yaddayadda is
Union[int, str, bool, List[yaddayadda], Dict[str, yaddayadda]]

Yes.

As qapi-code-gen.txt explains, we have two layers of syntax:

* The bottom layer is (heavily bastardized) JSON.  qapi-code-gen.txt
   specifies it by listing the differences to RFC 8259.  parser.py parses
   it into abstract syntax trees.


Aside: A new realization about a deviation from JSON: objects are inherently unordered collections.

* The upper layer recognizes the abstract syntax trees that are valid as
   QAPI schema.  qapi-code-gen.txt specifies it with a context-free
   grammar.  expr.py checks the ASTs against that grammar.  It also
   expands shorthand forms into longhand.

Detail not documented in qapi-code-gen.txt: parser.py rejects non-object
at the top-level, so expr.py doesn't have to.


Yep.

expr.py is what validates the yaddayadda, so there's no point in trying
to type it further, I think.

If mypy could do recursive types, typing it further would be a
no-brainer: just state what is.

Since it can't, we need to stop typing / start cheating at some point.
Where exactly is not obvious.  Your idea is at least as good as mine.

Probably worth a better comment.

Yes :)


I'll look at Patch 8 and then revisit, but I will attempt to make a better comment. I think there are bits of part 5 that makes it a bit more obvious, because I create a Real Type :tm: to pass each "Expression" as a whole over to to expr.py.

(This is just kind of an intermediate form and as such it's not necessarily something I gave tremendous thought to.)

Hmm, there you used Any, not object.  I guess that's because mypy lets
you get away with object here, but not there.  Correct?


Yep. I can get away with the stricter type here because of how we use
it, so I did. That's an artifact of it not being recursive and how
expr.py's entire raison d'etre is using isinstance() checks to
effectively downcast for us everywhere already.

Also, PEP 8 comment line length.


Augh.

Is there a way to set emacs mode highlighting in Python such that it
highlights when I run past the 72-col margin, but only for comments?

I have the general-purpose highlighter on for the 80-col margin.

Got a .emacs snippet for me?


I use only these bits:

 ;; Reflow-width defaults to 72.
 '(fill-column 72)

 ;; Highlight past column 80
 '(whitespace-line-column 80)

 ;; Theme whitespace highlighting as such:
 '(whitespace-style '(face empty tabs lines-tail trailing))

 ;; Don't insert tabs for spaces
 '(indent-tabs-mode nil)

I'm not familiar with any setting like this for any of the linters or
pycharm right away either.

Hmm, ... okay, TIL from pycodestyle(1):

             --max-line-length=n  set maximum allowed line length (default: 79)
             --max-doc-length=n   set maximum allowed doc line length and 
perform these
                                  checks (unchecked if not set)

Let me know whether --max-doc-length=72 fits the bill.


It does.

I'd need to send a fixup patch in order to enable it, but I am not thrilled with the idea of having to squabble with you over how to break lines that are just barely overlong.

Least annoying for me: I write a draft patch to get the flake8 baseline for local testing, you copy-edit the patch to your stylistic liking, I'll ACK the edits.


# Names must be letters, numbers, -, and _. They must start with letter,
@@ -287,9 +295,20 @@ def check_event(expr, info):
def check_exprs(exprs):
       for expr_elem in exprs:
-        expr = expr_elem['expr']
-        info = expr_elem['info']
-        doc = expr_elem.get('doc')
+        # Expression
+        assert isinstance(expr_elem['expr'], dict)

Must be an *ordered* mapping, actually.  It's only ever OrderedDict.


Allegedly. Lawsuit pending appeal.

+        for key in expr_elem['expr'].keys():
+            assert isinstance(key, str)
+        expr: Expression = expr_elem['expr']

*Unchecked* way to tell the type checker (I think):

              expr = cast(Expression, expr_elem['expr']

I like checking in general, but is it worth the bother here?


It all goes away in the first half of part 5, where I create an Expression object that has typed fields for its components, and mypy's static checker does the rest of the lifting.

Could do casts, yeah, but I suppose I liked the assert to let me know that the types I saw on the back of my eyelids were the real ones.


You're fine with repeating exp_elem['expr'] here, and ...

+
+        # QAPISourceInfo
+        assert isinstance(expr_elem['info'], QAPISourceInfo)
+        info: QAPISourceInfo = expr_elem['info']

... expr_elem['info'] here, but ...

+
+        # Optional[QAPIDoc]
+        tmp = expr_elem.get('doc')
+        assert tmp is None or isinstance(tmp, QAPIDoc)
+        doc: Optional[QAPIDoc] = tmp

... you avoid repetition of expr_elem.get('doc') here.  Any particular
reason?


Because this looks like garbage written by a drunkard:

assert expr_elem.get('doc') is None or isinstance(expr_elem.get('doc'),
QAPIDoc)
doc: Optional[QAPIDoc] = expr_elem.get('doc')

Unchecked way:

              doc = cast(Optional[QAPIDoc], expr_elem.get('doc'))

if 'include' in expr:
               continue

I'll see if I can clean it up a little. I will take your suggestion of casts to mean that you'd be okay with actually not checking the form at runtime.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]