qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: acceptance-system-fedora failures


From: Philippe Mathieu-Daudé
Subject: Re: acceptance-system-fedora failures
Date: Tue, 13 Oct 2020 10:57:04 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.3.1

On 10/9/20 12:37 PM, Pavel Dovgalyuk wrote:
On 08.10.2020 14:50, Kevin Wolf wrote:
Am 08.10.2020 um 12:26 hat Philippe Mathieu-Daudé geschrieben:
On 10/7/20 3:11 PM, Pavel Dovgalyuk wrote:
On 07.10.2020 15:49, Philippe Mathieu-Daudé wrote:
On 10/7/20 2:20 PM, Pavel Dovgalyuk wrote:
On 07.10.2020 14:22, Alex Bennée wrote:

Philippe Mathieu-Daudé <philmd@redhat.com> writes:

On 10/7/20 10:51 AM, Pavel Dovgalyuk wrote:
On 07.10.2020 11:23, Thomas Huth wrote:
On 07/10/2020 09.13, Philippe Mathieu-Daudé wrote:
Thanks, that was helpful. ... and the winner is:

       commit   55adb3c45620c31f29978f209e2a44a08d34e2da
       Author:  John Snow <jsnow@redhat.com>
       Date:    Fri Jul 24 01:23:00 2020 -0400
       Subject: ide: cancel pending callbacks on SRST

... starting with this commit, the tests starts failing. John, any
idea what
might be causing this?

This patch includes the following lines:

+        aio_bh_schedule_oneshot(qemu_get_aio_context(),
+                               ide_bus_perform_srst, bus);

replay_bh_schedule_oneshot_event should be used instead of this
function, because it synchronizes non-deterministic BHs.

Why do we have 2 different functions? BH are already complex
enough, and we need to also think about the replay API...

What about the other cases such vhost-user (blk/net), virtio-blk?

This does seem like something that should be wrapped up inside
aio_bh_schedule_oneshot itself or maybe we need a
aio_bh_schedule_transaction_oneshot to distinguish it from the other
uses the function has.


Maybe there should be two functions:
- one for the guest modification

aio_bh_schedule_oneshot_deterministic()?

- one for internal qemu things

Not sure why there is a difference, BH are used to
avoid delaying the guest, so there always something
related to "guest modification".

Not exactly. At least there is one non-related-to-guest case
in monitor_init_qmp:
        /*
         * We can't call qemu_chr_fe_set_handlers() directly here
         * since chardev might be running in the monitor I/O
         * thread.  Schedule a bottom half.
         */
       aio_bh_schedule_oneshot(iothread_get_aio_context(mon_iothread),                                monitor_qmp_setup_handlers_bh, mon);

I don't understand the documentation in docs/devel/replay.txt:

---
Bottom halves
=============

Bottom half callbacks, that affect the guest state, should be invoked
through
replay_bh_schedule_event or replay_bh_schedule_oneshot_event functions.
Their invocations are saved in record mode and synchronized with the
existing
log in replay mode.
---

But then it is only used in block drivers, which are not
related to guest state:

Pavel can tell you the details, but I think the idea was that you need
to use this function not when the code calling it modifies guest state,
but when the BH implementation can do so.

In the case of generic callbacks like provided by the blk_aio_*()
functions, we don't know whether this is the case, but it's generally
device emulation code, so chances are relatively high that they do.

I seem to remember that when reviewing the code that introduced
replay_bh_schedule_event(), I was relatively sure that we didn't catch
all necessary instances, but since it worked for Pavel and I didn't feel
like getting too involved with replay code, we just merged it anyway.

That's right.
Block layer does not touch guest by itself.
But it includes callbacks that may invoke interrupts on finishing disk operations. That is why we synchronize these callbacks with vCPU through the replay layer.

Instead having to remember to use replay_bh_schedule_event when
guest state is modified else the code is buggy, what about expecting
replay used everywhere, and disabling its use when we know guest state
is not modified?


Pavel Dovgalyuk





reply via email to

[Prev in Thread] Current Thread [Next in Thread]