qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 12/15] iotests: Disable AQMP logging under non-debug modes


From: Hanna Reitz
Subject: Re: [PATCH 12/15] iotests: Disable AQMP logging under non-debug modes
Date: Tue, 5 Oct 2021 17:12:02 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

On 04.10.21 20:32, John Snow wrote:


On Mon, Oct 4, 2021 at 6:12 AM Hanna Reitz <hreitz@redhat.com <mailto:hreitz@redhat.com>> wrote:

    On 18.09.21 04:14, John Snow wrote:
    >
    >
    > On Fri, Sep 17, 2021 at 8:58 PM John Snow <jsnow@redhat.com
    <mailto:jsnow@redhat.com>
    > <mailto:jsnow@redhat.com <mailto:jsnow@redhat.com>>> wrote:
    >
    >
    >
    >     On Fri, Sep 17, 2021 at 10:30 AM Hanna Reitz
    <hreitz@redhat.com <mailto:hreitz@redhat.com>
    >     <mailto:hreitz@redhat.com <mailto:hreitz@redhat.com>>> wrote:
    >
    >         On 17.09.21 07:40, John Snow wrote:
    >         > Disable the aqmp logger, which likes to (at the moment)
    >         print out
    >         > intermediate warnings and errors that cause session
    >         termination; disable
    >         > them so they don't interfere with the job output.
    >         >
    >         > Leave any "CRITICAL" warnings enabled though, those
    are ones
    >         that we
    >         > should never see, no matter what.
    >
    >         I mean, looks OK to me, but from what I understand (i.e.
    little),
    >         qmp_client doesn’t log CRITICAL messages, at least I
    can’t see
    >         any. Only
    >         ERRORs.
    >
    >
    >     There's *one* critical message in protocol.py, used for a
    >     circumstance that I *think* should be impossible. I do not
    think I
    >     currently use any WARNING level statements.
    >
    >         I guess I’m missing some CRITICAL messages in external
    >         functions called
    >         from qmp_client.py, but shouldn’t we still keep ERRORs?
    >
    >
    >     ...Mayyyyyybe?
    >
    >     The errors logged by AQMP are *almost always* raised as
    Exceptions
    >     somewhere else, eventually. Sometimes when we encounter them in
    >     one context, we need to save them and then re-raise them in a
    >     different execution context. There's one good exception to this:
    >     My pal, EOFError.
    >
    >     If the reader context encounters EOF, it raises EOFError and
    this
    >     causes a disconnect to be scheduled asynchronously. *Any*
    >     Exception that causes a disconnect to be scheduled
    asynchronously
    >     is dutifully logged as an ERROR. At this point in the code, we
    >     don't really know if the user of the library considers this an
    >     "error" yet or not. I've waffled a lot on how exactly to treat
    >     this circumstance. ...Hm, I guess that's really the only case
    >     where I have an error that really ought to be suppressed. I
    >     suppose what I will do here is: if the exception happens to
    be an
    >     EOFError I will drop the severity of the log message down to
    INFO.
    >     I don't know why it takes being challenged on this stuff to
    start
    >     thinking clearly about it, but here we are. Thank you for your
    >     feedback :~)
    >
    >     --js
    >
    >
    > Oh, CI testing reminds me of why I am a liar here.
    >
    > the mirror-top-perms test intentionally expects not to be able to
    > connect, but we're treated to these two additional lines of output:
    >
    > +ERROR:qemu.aqmp.qmp_client.qemub-2536319:Negotiation failed:
    EOFError
    > +ERROR:qemu.aqmp.qmp_client.qemub-2536319:Failed to establish
    session:
    > EOFError
    >
    > Uh. I guess a temporary suppression in mirror-top-perms, then ...?

    Sounds right to me, if that’s simple enough.

    (By the way, I understand it right that you want to lower the
    severity
    of EOFErrors to INFO only on disconnect, right?  Which is why they’re
    still logged as ERRORs here, because they aren’t occurring on
    disconnects?)


More or less, yeah.

When an EOFError causes the reader coroutine to halt (because it can't read the next message), I decided (in v2) to drop that one particular logging message down to "INFO", because it might -- or might not be -- an expected occurrence from the point of view of whoever is managing the QMP connection. Maybe it was expected (The test used qemu-guest-agent or something else to make the guest shutdown, taking QEMU down with it without the knowledge of the QMP library layer) or maybe it was unexpected (the QMP remote really just disappeared from us on a whim). There's no way to know, so it probably isn't right to consider it an error.

In the connection case, I left it as an ERROR because the caller asked us to connect to an endpoint and we were unable to, which feels unambiguous. It will be ultimately reported via Exceptions as qemu.aqmp.ConnectError, with additional information available in fields of that exception object. Even though the exception is reported to the caller, I decided to log the occurrence anyway, because I felt like it should be the job of the library to make a good log and not the caller's responsibility to catch the exception and then log it themselves.

That does leave us with this atypical case though: the caller is intentionally causing problems :)

(Well, atypical for a user of the library who is using it to make a tool they expect to work. Quite a typical case for tests in the abstract, though we only seem to have precisely one usage of this pattern in iotests so far.)

    > In most other cases I rather imagine we do want to see this kind of
    > output to help give more information about how things have
    failed --
    > they ARE errors. We just happen to not care about them right here.

    Well, in fact we do expect them here, so we could even log them, but
    first I don’t know whether they’re stable enough for that, and second
    this would break the “choose the AQMP or legacy QMP back-end via an
    environment variable” thing.


Yeah, that's the other half of it, I came to realize. Just logging the expected failure feels the most idiomatic, but it requires a new logging filter (I have to filter out the PID from the logger name to make it work consistently) and breaks the ability to switch.

So I suppose "for now" just disabling the logger output for a critical section and leaving a comment that states that once we're feeling confident about the new library (maybe after the next release when everything has really gone through the testing wash cycle) we can remove the suppression and just log the errors the normal way.

    > The only thing I don't exactly like about this is that it requires
    > some knowledge by the caller to know to disable it. It makes
    writing
    > negative tests a bit more annoying because the library leans so
    > heavily on yelling loudly when it encounters problems.

    Yeah.  I’m afraid I don’t have a good idea on how to convey test
    writers
    how to suppress these errors, even if it were a simple one-liner
    (like
    `self.vm_b.set_log_threshold(logging.CRITICAL)`)...

    We could start an “iotests tips and tricks” document, but do we
    want to?

    Hanna


Hm, not just yet. More chances for un-checked stuff to bitrot. There's only this one caller that poses a problem here, so it's probably not going to be too difficult to just update tests to expect the error logs.

OK.

V2 soon. I already respun it, but it's been a week, so I will need to scrub it down with a good proof-reading pass. Thanks for your patience, and I hope you enjoyed your PTO :~)

You know how it is.  Some of it was spent enjoying it, the rest was spent lamenting that I’m not enjoying it enough.

Hanna




reply via email to

[Prev in Thread] Current Thread [Next in Thread]