[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v3 0/4] Introduce attributes for timers subsyste
From: |
Pavel Dovgalyuk |
Subject: |
Re: [Qemu-devel] [PATCH v3 0/4] Introduce attributes for timers subsystem and remove QEMU_CLOCK_VIRTUAL_EXT clock type |
Date: |
Thu, 10 Jan 2019 16:30:30 +0300 |
Hi,
It seems, that this approach is not always correct.
Now timerlist_deadline_ns uses all virtual timers for deadline calculation
(including external
ones).
qemu_start_warp_timer uses the deadline for setting warp timer (which should be
deterministic).
Therefore warp timer may become non-deterministic and replay may behave
differently compared to
recording phase.
We have to rollback these or improve somehow to avoid non-determinism.
Pavel Dovgalyuk
> -----Original Message-----
> From: Artem Pisarenko [mailto:address@hidden
> Sent: Thursday, October 18, 2018 2:04 PM
> To: address@hidden
> Cc: Pavel Dovgalyuk; Paolo Bonzini; Artem Pisarenko
> Subject: [PATCH v3 0/4] Introduce attributes for timers subsystem and remove
> QEMU_CLOCK_VIRTUAL_EXT clock type
>
> Recent patches from series [PATCH v6] "Fixing record/replay and adding
> reverse debugging"
> introduced new clock type QEMU_CLOCK_VIRTUAL_EXT and replaced virtual timers
> in some external
> subsystems with it.
> This resulted in small change to existing behavior, which I consider to be
> unacceptable.
> Processing of virtual timers, belonging to new clock type, was kicked off to
> main loop, which
> made them asynchronous with vCPU thread and, in icount mode, with whole guest
> execution. This
> breaks expected determinism in non-record/replay icount mode of emulation
> where these
> "external subsystems" are isolated from host (i.e. they external only to
> guest core, not to
> emulation environment).
>
> Example for slirp ("user" backend for network device):
> User runs qemu in icount mode with rtc clock=vm without any external
> communication interfaces
> but with "-netdev user,restrict=on". It expects deterministic execution,
> because network
> services are emulated inside qemu and isolated from host. There are no
> reasons to get reply
> from DHCP server with different delay or something like that.
>
> These series of patches revert those commits and reimplement their
> modifications in a better
> way.
>
> Current implementation of timers/clock processing is confusing (at least for
> me) because of
> exceptions from design concept behind them, which already introduced by
> icount mode (which
> adds QEMU_CLOCK_VIRTUAL_RT). Adding QEMU_CLOCK_VIRTUAL_EXT just made things
> even more
> complicated. I consider these "alternative" virtual clocks to be some kind of
> hacks being
> convinient only to authors of relevant qemu features. Lets don't touch
> fundamental clock types
> and keep them orthogonal to special cases of timers handling.
>
> As far as I understand, original intention of author was just to make
> optimization in replay
> log by avoiding storing extra events which don't change guest state directly.
> Indeed, for
> example, ipv6 icmp timer in slirp does things which external to guest core
> and ends with
> sending network packet to guest, but record/replay will anyway catch event,
> corresponding to
> packet reception in guest network frontend, and store it to replay log, so
> there are no need
> in making checkpoint for corresponding clock when that timer fires.
> If so, then we just need to skip checkpoints for clock values, when only
> these specific timers
> are run. It is individual timers which are specific, not clock.
> Adding some kind of attribute/flag/property to individual timer allows any
> special qemu
> feature (such as record/replay) to inspect it and handle as needed. This
> design achieves less
> dependencies, more transparency and has more intuitive and clear sense. For
> record/replay
> feature it's achieved with 'EXTERNAL' attribute. The only drawback is that it
> required to add
> one extra mutex unlock/lock pair for virtual clock type in
> timerlist_run_timers() function
> (see patch 3), but it's being added only when qemu runs in record/replay mode.
>
> Finally, this patch series optimizes checkpointing for all other clocks it
> applies to.
>
>
> v3 changes:
> - fixed failed build in last patch
> - attributes has been refactored and restricted to be used only within
> qemu-timer (as
> suggested by Stefan Hajnoczi and Paolo Bonzini)
>
> v2 changes:
> - timers/aio interface refactored and improved, addition to couroutines
> removed (as Paolo
> Bonzini suggested)
> - fixed wrong reimplementation in qemu-timer.c caused race conditions
> - added bonus patch which optimizes checkpointing for other clocks
>
> P.S. I've tried to test record/replay with slirp, but in replay mode qemu
> stucks at guest
> linux boot after "Configuring network interfaces..." message, where DHCP
> communication takes
> place. It's broken in a same way both in master and master with reverted
> commits being fixed.
>
> P.S.2. I wasn't able to test these patches on purely clean master branch
> because of bugs
> https://bugs.launchpad.net/qemu/+bug/1790460 and
> https://bugs.launchpad.net/qemu/+bug/1795369,
> which workarounded by several not-accepted (yet?) modifications.
>
>
> Artem Pisarenko (4):
> Revert some patches from recent [PATCH v6] "Fixing record/replay and
> adding reverse debugging"
> Introduce attributes to qemu timer subsystem
> Restores record/replay behavior related to special virtual clock
> processing for timers used in external subsystems.
> Optimize record/replay checkpointing for all clocks it applies to
>
> include/block/aio.h | 59 ++++++++++++++++++---
> include/qemu/timer.h | 128
> +++++++++++++++++++++++-----------------------
> slirp/ip6_icmp.c | 9 ++--
> tests/ptimer-test-stubs.c | 13 +++--
> ui/input.c | 9 ++--
> util/qemu-timer.c | 95 +++++++++++++++++++++++-----------
> 6 files changed, 201 insertions(+), 112 deletions(-)
>
> --
> 2.7.4
- Re: [Qemu-devel] [PATCH v3 0/4] Introduce attributes for timers subsystem and remove QEMU_CLOCK_VIRTUAL_EXT clock type,
Pavel Dovgalyuk <=