
[Qemu-devel] racing between pause_all_vcpus() and qemu_cpu_stop()


From: Peter Maydell
Subject: [Qemu-devel] racing between pause_all_vcpus() and qemu_cpu_stop()
Date: Mon, 1 Oct 2018 18:03:30 +0100

I've been investigating a race condition where sometimes when my
guest writes to a device register which triggers a
qemu_system_reset_request(), it doesn't actually cause a clean reset,
but instead the guest CPU continues to execute instructions.
I managed to repro it under 'rr', which let me walk through enough
of what was going on to determine the following:

When a guest CPU thread calls qemu_system_reset_request(), this
results in a call to qemu_cpu_stop(current_cpu, true), to
make the CPU come back out to the main loop. We also set the
reset_requested flag, to get the IO thread to actually do the
reset.

The main loop thread (i.e. the IO thread) runs main_loop_should_exit().
If there is a pending reset, it calls pause_all_vcpus(), with the
intention that this quiesces all the guest CPUs before it starts
messing with reset actions.

pause_all_vcpus() just waits for every CPU to have cpu->stopped set.
However, if the running CPU has just called qemu_cpu_stop() on
itself then it will have set cpu->stopped true but not actually
made it out to the main loop yet. (In the case I'm looking at,
what happens is that as soon as the CPU thread unlocks the
iothread mutex in io_writex() after the device write, the
main thread runs and does all the reset operations.)
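
To make the window concrete, here is a minimal single-threaded model of
the sequence described above. It is a sketch only and is not QEMU code:
every model_* name, and the in_exec_loop field, is hypothetical and
exists purely for illustration. The one thing it borrows from the
description is that the wait condition looks at the stopped flag alone.

```c
/* Toy model of the race window; not QEMU code.
 * All model_* names and the in_exec_loop field are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ModelCPU {
    bool stopped;       /* stands in for cpu->stopped */
    bool in_exec_loop;  /* hypothetical: thread is still running guest code */
} ModelCPU;

/* A CPU thread stopping itself: the flag is set immediately, but the
 * thread has not yet left its execution loop. */
static void model_cpu_stop_self(ModelCPU *cpu)
{
    assert(cpu != NULL);
    cpu->stopped = true;
    /* in_exec_loop is deliberately left unchanged here. */
}

/* The CPU thread eventually getting back out of the execution loop. */
static void model_cpu_leave_exec_loop(ModelCPU *cpu)
{
    assert(cpu != NULL);
    cpu->in_exec_loop = false;
}

/* The wait condition as described: it only checks the stopped flag. */
static bool model_all_vcpus_stopped(const ModelCPU *cpus, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!cpus[i].stopped) {
            return false;
        }
    }
    return true;
}

/* True while some CPU is flagged stopped but is still executing,
 * i.e. while the race window is open. */
static bool model_race_window_open(const ModelCPU *cpus, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (cpus[i].stopped && cpus[i].in_exec_loop) {
            return true;
        }
    }
    return false;
}
```

In this model, once a CPU has called model_cpu_stop_self() the wait
condition is satisfied even though model_race_window_open() still
reports true, and it only becomes false after
model_cpu_leave_exec_loop(). That gap corresponds to the period in which
the main thread can start the reset.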

The reset code in the iothread then proceeds to start calling
various reset functions while the CPU thread is still inside
the exec loop, running generated code and so on. This doesn't
seem like what ought to happen. In particular it includes
calling cpu_common_reset(), which clears all kinds of flags
relevant to the still-executing CPU...

Any suggestions for how we should fix this?

thanks
-- PMM


