Re: [Qemu-devel] [PATCH] replay: wake up vCPU when replaying
From: Pavel Dovgalyuk
Subject: Re: [Qemu-devel] [PATCH] replay: wake up vCPU when replaying
Date: Mon, 9 Jul 2018 14:24:59 +0300
There are some situations where this patch still doesn't help.
I think this happens because of a race condition in qemu_tcg_rr_wait_io_event():
static void qemu_tcg_rr_wait_io_event(CPUState *cpu)
{
    while (all_cpu_threads_idle()) {
        stop_tcg_kick_timer();
        qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
    }
    start_tcg_kick_timer();
    qemu_wait_io_event_common(cpu);
}
all_cpu_threads_idle() returns true when there is no queued work.
But between that call and qemu_cond_wait() the iothread may add queued work,
and the vCPU thread will then sleep indefinitely.
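
For reference, here is a minimal standalone sketch of the check-then-wait
pattern in plain pthreads. It is not QEMU code: big_lock, halt_cond,
work_queued, io_thread_fn() and wait_for_work() are hypothetical names made up
for the illustration. It shows the usual rule for condition variables: the
wakeup cannot be lost as long as the producer changes the predicate while
holding the same mutex that the waiter holds between its check and its wait.
Whether the real iothread path always queues work under qemu_global_mutex is
an assumption that would have to be checked separately.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; none of these are QEMU symbols. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t halt_cond = PTHREAD_COND_INITIALIZER;
static bool work_queued; /* the predicate, protected by big_lock */

/* Producer: plays the role of the iothread queueing work. */
static void *io_thread_fn(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&big_lock);
    work_queued = true;                 /* predicate changes under the lock */
    pthread_cond_broadcast(&halt_cond); /* wake any waiter */
    pthread_mutex_unlock(&big_lock);
    return NULL;
}

/* Consumer: plays the role of the vCPU thread. Returns 1 once it was woken. */
static int wait_for_work(void)
{
    pthread_t io;

    pthread_mutex_lock(&big_lock);
    if (pthread_create(&io, NULL, io_thread_fn, NULL) != 0) {
        pthread_mutex_unlock(&big_lock);
        return 0;
    }
    /*
     * The check and the wait both happen with big_lock held, and
     * pthread_cond_wait() releases the lock atomically while blocking.
     * The producer therefore cannot run between the check and the wait.
     */
    while (!work_queued) {
        pthread_cond_wait(&halt_cond, &big_lock);
    }
    assert(work_queued);
    work_queued = false;                /* consume the work */
    pthread_mutex_unlock(&big_lock);

    pthread_join(io, NULL);
    return 1;
}
```

If the producer set work_queued without taking big_lock, it could run exactly
between the check and pthread_cond_wait(), and the broadcast would arrive
before anyone is waiting. That is the kind of window described above.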
Does anyone have an idea how to fix this?
Pavel Dovgalyuk
> -----Original Message-----
> From: Pavel Dovgalyuk [mailto:address@hidden]
> Sent: Tuesday, July 03, 2018 11:53 AM
> To: address@hidden
> Cc: address@hidden; address@hidden; address@hidden;
> address@hidden; address@hidden
> Subject: [PATCH] replay: wake up vCPU when replaying
>
> In record/replay icount mode the vCPU thread and the iothread synchronize
> their execution using checkpoints.
> The vCPU thread processes the virtual timers and the iothread processes all others.
> When the iothread wants to wake up a sleeping vCPU thread, it sends dummy queued
> work. Therefore the following sequence of events is possible in
> record mode:
> - IO: sending dummy work
> - IO: processing timers
> - CPU: wakeup
> - CPU: clearing dummy work
> - CPU: processing virtual timers
>
> But due to races in replay mode, the sequence may change:
> - IO: sending dummy work
> - CPU: wakeup
> - CPU: clearing dummy work
> - CPU: sleeping again because there is nothing to do
> - IO: processing timers
> - CPU: zzzz
>
> In this case the vCPU will not wake up, because the dummy work is not queued
> again.
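
The two orderings quoted above can be replayed in a small single-threaded toy
model. This is only a sketch under stated assumptions: struct model, step(),
ends_stuck() and the event names are hypothetical and do not correspond to
QEMU APIs, and the model assumes that "IO: processing timers" leaves something
behind that the vCPU has to react to.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the quoted event sequences; all names are hypothetical. */
struct model {
    bool dummy_work;     /* wake-up request sent by the iothread */
    bool timers_pending; /* assumption: IO timer processing leaves vCPU work */
    bool cpu_asleep;
};

enum event { IO_SEND_DUMMY, IO_PROCESS_TIMERS, CPU_WAKE_AND_RUN };

static void step(struct model *m, enum event e)
{
    switch (e) {
    case IO_SEND_DUMMY:
        m->dummy_work = true;
        break;
    case IO_PROCESS_TIMERS:
        m->timers_pending = true;
        break;
    case CPU_WAKE_AND_RUN:
        if (!m->dummy_work) {
            break;                 /* nothing wakes the vCPU */
        }
        m->cpu_asleep = false;     /* CPU: wakeup */
        m->dummy_work = false;     /* CPU: clearing dummy work */
        m->timers_pending = false; /* CPU: processing virtual timers, if any */
        m->cpu_asleep = true;      /* nothing left to do: sleep again */
        break;
    }
}

/* Returns true if the vCPU ends up asleep while work is still pending. */
static bool ends_stuck(const enum event *seq, int n)
{
    struct model m = { false, false, true };
    for (int i = 0; i < n; i++) {
        step(&m, seq[i]);
    }
    return m.cpu_asleep && m.timers_pending;
}

static const enum event record_order[] = {
    IO_SEND_DUMMY, IO_PROCESS_TIMERS, CPU_WAKE_AND_RUN
};
static const enum event replay_order[] = {
    IO_SEND_DUMMY, CPU_WAKE_AND_RUN, IO_PROCESS_TIMERS
};

/* Usage: the record ordering completes, the replay ordering gets stuck. */
static void check_orders(void)
{
    assert(!ends_stuck(record_order, 3));
    assert(ends_stuck(replay_order, 3));
}
```

In the replay ordering the dummy work has already been consumed by the time
the timers are processed, so nothing in the model wakes the vCPU again.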
>
> This patch tries to wake up the vCPU when it is sleeping and the icount warp
> checkpoint isn't met. A missed checkpoint means that the vCPU has something
> to do, because there is no other reason for the warp checkpoint not to match.
>
> Signed-off-by: Pavel Dovgalyuk <address@hidden>
> ---
> cpus.c | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/cpus.c b/cpus.c
> index 181ce33..bad6a33 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -539,11 +539,6 @@ void qemu_start_warp_timer(void)
>          return;
>      }
>  
> -    /* warp clock deterministically in record/replay mode */
> -    if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP_START)) {
> -        return;
> -    }
> -
>      if (!all_cpu_threads_idle()) {
>          return;
>      }
> @@ -553,6 +548,16 @@ void qemu_start_warp_timer(void)
>          return;
>      }
>  
> +    /* warp clock deterministically in record/replay mode */
> +    if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP_START)) {
> +        /* vCPU is sleeping and warp can't be started.
> +           It is probably a race condition: notification sent
> +           to vCPU was processed in advance and vCPU went to sleep.
> +           Therefore we have to wake it up for doing something. */
> +        qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
> +        return;
> +    }
> +
>      /* We want to use the earliest deadline from ALL vm_clocks */
>      clock = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT);
>      deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);