From: Nicholas Piggin
Subject: Re: [PATCH v2 16/21] virtio-net: Use replay_schedule_bh_event for bhs that affect machine state
Date: Thu, 15 Aug 2024 17:12:32 +1000

On Thu Aug 15, 2024 at 3:25 AM AEST, Alex Bennée wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
>
> > On Wed, Aug 14, 2024 at 04:05:34PM +1000, Nicholas Piggin wrote:
> >> On Wed Aug 14, 2024 at 6:48 AM AEST, Michael S. Tsirkin wrote:
> >> > On Tue, Aug 13, 2024 at 09:23:24PM +0100, Alex Bennée wrote:
> >> > > From: Nicholas Piggin <npiggin@gmail.com>
> >> > >
> >> > > The regular qemu_bh_schedule() calls result in non-deterministic
> >> > > execution of the bh in record-replay mode, which causes replay failure.
> >> > >
> >> > > Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
> >> > > Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
> >> > > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> >> > > Message-Id: <20240813050638.446172-9-npiggin@gmail.com>
> >> > > Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> >> > > ---
> >> > > hw/net/virtio-net.c | 11 ++++++-----
> >> > > 1 file changed, 6 insertions(+), 5 deletions(-)
> >> > >
> >> > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> >> > > index 08aa0b65e3..10ebaae5e2 100644
> >> > > --- a/hw/net/virtio-net.c
> >> > > +++ b/hw/net/virtio-net.c
> >> > > @@ -40,6 +40,7 @@
> >> > > #include "migration/misc.h"
> >> > > #include "standard-headers/linux/ethtool.h"
> >> > > #include "sysemu/sysemu.h"
> >> > > +#include "sysemu/replay.h"
> >> > > #include "trace.h"
> >> > > #include "monitor/qdev.h"
> >> > > #include "monitor/monitor.h"
> >> > > @@ -417,7 +418,7 @@ static void virtio_net_set_status(struct
> >> > > VirtIODevice *vdev, uint8_t status)
> >> > > timer_mod(q->tx_timer,
> >> > > qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL)
> >> > > + n->tx_timeout);
> >> > > } else {
> >> > > - qemu_bh_schedule(q->tx_bh);
> >> > > + replay_bh_schedule_event(q->tx_bh);
> >> > > }
> >> > > } else {
> >> > > if (q->tx_timer) {
> >> > > @@ -2672,7 +2673,7 @@ static void
> >> > > virtio_net_tx_complete(NetClientState *nc, ssize_t len)
> >> > > */
> >> > > virtio_queue_set_notification(q->tx_vq, 0);
> >> > > if (q->tx_bh) {
> >> > > - qemu_bh_schedule(q->tx_bh);
> >> > > + replay_bh_schedule_event(q->tx_bh);
> >> > > } else {
> >> > > timer_mod(q->tx_timer,
> >> > > qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
> >> > > n->tx_timeout);
> >> > > @@ -2838,7 +2839,7 @@ static void virtio_net_handle_tx_bh(VirtIODevice
> >> > > *vdev, VirtQueue *vq)
> >> > > return;
> >> > > }
> >> > > virtio_queue_set_notification(vq, 0);
> >> > > - qemu_bh_schedule(q->tx_bh);
> >> > > + replay_bh_schedule_event(q->tx_bh);
> >> > > }
> >> > >
> >> > > static void virtio_net_tx_timer(void *opaque)
> >> > > @@ -2921,7 +2922,7 @@ static void virtio_net_tx_bh(void *opaque)
> >> > > /* If we flush a full burst of packets, assume there are
> >> > > * more coming and immediately reschedule */
> >> > > if (ret >= n->tx_burst) {
> >> > > - qemu_bh_schedule(q->tx_bh);
> >> > > + replay_bh_schedule_event(q->tx_bh);
> >> > > q->tx_waiting = 1;
> >> > > return;
> >> > > }
> >> > > @@ -2935,7 +2936,7 @@ static void virtio_net_tx_bh(void *opaque)
> >> > > return;
> >> > > } else if (ret > 0) {
> >> > > virtio_queue_set_notification(q->tx_vq, 0);
> >> > > - qemu_bh_schedule(q->tx_bh);
> >> > > + replay_bh_schedule_event(q->tx_bh);
> >> > > q->tx_waiting = 1;
> >> > > }
> >> > > }
> >> > > --
> >> > > 2.39.2
> >> >
> >> >
> >> > Is this really the only way to fix this? I do not think
> >> > virtio has any business knowing about replay.
> >> > What does this API do, even? BH but not broken with replay?
> >> > Do we ever want replay broken? Why not fix qemu_bh_schedule?
> >> > And when we add another feature which we do not want to break
> >> > will we do foo_bar_replay_bh_schedule_event or what?
> >>
> >> I agree with you. We need to do this (a couple of other hw
> >> subsystems already do and likely some are still broken vs
> >> replay and would need to be converted), but I think it's
> >> mostly a case of bad naming. You're right that the caller should
> >> not know about replay at all; what it should know is whether
> >> the event is for the target machine or the host harness,
> >> the same way timers are VIRTUAL / HOST.
> >> So I think we just need to make a qemu_bh_schedule_<type>,
> >> or qemu_bh_schedule_event(... QEMU_EVENT_VIRTUAL/HOST/etc).
> >
> > Or just pass QEMUClockType?

Could be a good idea. Although I'm not sure what to do with
all types, maybe we can restrict what is supported.

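For illustration only, the QEMUClockType idea could look something like
the sketch below. It is a standalone toy in C, not QEMU code:
SketchClockType, SketchBH, sketch_bh_schedule_event() and the two queues
are hypothetical names made up for this example. Only the VIRTUAL and
HOST types are accepted, as one possible way to restrict what is
supported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a clock/event domain; not QEMU's QEMUClockType. */
typedef enum {
    SKETCH_CLOCK_REALTIME,
    SKETCH_CLOCK_VIRTUAL,   /* affects target machine state */
    SKETCH_CLOCK_HOST,      /* host-side housekeeping only */
} SketchClockType;

/* Hypothetical stand-in for a bottom half. */
typedef struct SketchBH {
    void (*cb)(void *opaque);
    void *opaque;
    bool scheduled;
} SketchBH;

#define SKETCH_QUEUE_MAX 16

/* Guest-visible bhs: kept in one FIFO so their order could be logged. */
static SketchBH *recorded_queue[SKETCH_QUEUE_MAX];
static size_t recorded_len;

/* Host-only bhs: ordering is not something a replay log would care about. */
static SketchBH *host_queue[SKETCH_QUEUE_MAX];
static size_t host_len;

/*
 * The caller names the domain of the event and never mentions replay.
 * Returns 0 on success, -1 for an unsupported type or a full queue.
 */
static int sketch_bh_schedule_event(SketchBH *bh, SketchClockType type)
{
    assert(bh != NULL && bh->cb != NULL);

    switch (type) {
    case SKETCH_CLOCK_VIRTUAL:
        if (recorded_len == SKETCH_QUEUE_MAX) {
            return -1;
        }
        recorded_queue[recorded_len++] = bh;
        break;
    case SKETCH_CLOCK_HOST:
        if (host_len == SKETCH_QUEUE_MAX) {
            return -1;
        }
        host_queue[host_len++] = bh;
        break;
    default:
        return -1;  /* restrict what is supported */
    }
    bh->scheduled = true;
    return 0;
}

/* Run a queue in FIFO order, then empty it. */
static void sketch_run_queue(SketchBH **queue, size_t *len)
{
    for (size_t i = 0; i < *len; i++) {
        queue[i]->scheduled = false;
        queue[i]->cb(queue[i]->opaque);
    }
    *len = 0;
}

/* Example callback: bumps the int counter passed as opaque. */
static void sketch_count_cb(void *opaque)
{
    int *counter = opaque;
    (*counter)++;
}
```

A device model would then only say which domain the bh belongs to, e.g.
sketch_bh_schedule_event(bh, SKETCH_CLOCK_VIRTUAL) for a tx bh that
changes guest-visible state, and would not need to name replay at all.
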
> Is this wider re-factoring something that can wait for the next
> developer cycle?

I would say so. It's not quite trivial to do nicely since
things are a bit tangled between util/async and replay.

> >> I had started on a conversion once but not completed it.
> >> I could resurrect if there is agreement on the API?
>
> I would certainly welcome it being cleaned up. The supported replay
> devices are very piecemeal at the moment.

I'll tidy up and post an RFC for how the new API might look.

Thanks,
Nick