[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] cpus.c: Fix race condition in cpu_stop_current(
From: |
Alex Bennée |
Subject: |
Re: [Qemu-devel] [PATCH] cpus.c: Fix race condition in cpu_stop_current() |
Date: |
Mon, 10 Dec 2018 11:05:26 +0000 |
User-agent: |
mu4e 1.1.0; emacs 26.1.90 |
Peter Maydell <address@hidden> writes:
> We use cpu_stop_current() to ensure the current CPU has stopped
> from places like qemu_system_reset_request(). Unfortunately its
> current implementation has a race. It calls qemu_cpu_stop(),
> which sets cpu->stopped to true even though the CPU hasn't
> actually stopped yet. The main thread will look at the flags
> set by qemu_system_reset_request() and call pause_all_vcpus().
> pause_all_vcpus() waits for every cpu to have cpu->stopped true,
> so it can continue (and we will start the system reset operation)
> before the vcpu thread has got back to its top level loop.
>
> Instead, just set cpu->stop and call cpu_exit(). This will
> cause the vcpu to exit back to the top level loop, and there
> (as part of the wait_io_event code) it will call qemu_cpu_stop().
>
> This fixes bugs where the reset request appeared to be ignored
> or the CPU misbehaved because the reset operation started
> to change vcpu state while the vcpu thread was still using it.
>
> Signed-off-by: Peter Maydell <address@hidden>
> ---
> We discussed this a little while back:
> https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg00154.html
> and Jaap reported a bug which I suspect of being the same thing:
> https://lists.gnu.org/archive/html/qemu-discuss/2018-10/msg00014.html
>
> Annoyingly I have lost the test case that demonstrated this
> race, but I analysed it at the time and this should definitely
> fix it. I have opted not to try to address any of the other
> possible cleanup here (eg vm_stop() has a potential similar
> race if called from a vcpu thread I suspect), since it gets
> pretty tangled.
>
> Jaap: could you test whether this patch fixes the issue you
> were seeing, please?
> ---
> cpus.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/cpus.c b/cpus.c
> index 0ddeeefc14f..b09b7027126 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -2100,7 +2100,8 @@ void qemu_init_vcpu(CPUState *cpu)
> void cpu_stop_current(void)
> {
> if (current_cpu) {
> - qemu_cpu_stop(current_cpu, true);
> + current_cpu->stop = true;
> + cpu_exit(current_cpu);
Should the FIXME in vm_stop also be fixed?
/*
* FIXME: should not return to device code in case
* vm_stop() has been requested.
*/
cpu_stop_current();
return 0;
> }
> }
--
Alex Bennée