From: Emilio G. Cota
Subject: Re: [Qemu-devel] [PATCH v5 11/13] cpu-exec-common: Introduce async_safe_run_on_cpu()
Date: Tue, 2 Aug 2016 15:22:42 -0400
User-agent: Mutt/1.5.23 (2014-03-12)

On Tue, Aug 02, 2016 at 18:27:42 +0100, Alex Bennée wrote:
> From: Sergey Fedorov <address@hidden>
> 
> This patch is based on the ideas found in work of KONRAD Frederic [1],
> Alex Bennée [2], and Alvise Rigo [3].
> 
> This mechanism allows an operation to be performed safely in a quiescent
> state. Quiescent state means: (1) no vCPU is running and (2) BQL in
> system-mode or 'exclusive_lock' in user-mode emulation is held while
> performing the operation. This functionality is required e.g. for
> performing translation buffer flush safely in multi-threaded user-mode
> emulation.
> 
> The existing CPU work queue is used to schedule such safe operations. A
> new 'safe' flag is added into struct qemu_work_item to designate the
> special requirements of the safe work. An operation in a quiescent sate

s/sate/state/

(snip)
> index a233f01..6d5da15 100644
> --- a/cpu-exec-common.c
> +++ b/cpu-exec-common.c
> @@ -25,6 +25,7 @@
>  
>  bool exit_request;
>  CPUState *tcg_current_cpu;
> +int tcg_pending_threads;
>  
>  /* exit the current TB, but without causing any exception to be raised */
>  void cpu_loop_exit_noexc(CPUState *cpu)
> @@ -79,6 +80,35 @@ void cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc)
>  }
>  
>  QemuCond qemu_work_cond;
> +QemuCond qemu_safe_work_cond;
> +QemuCond qemu_exclusive_cond;
> +
> +static int safe_work_pending;
> +
> +#ifdef CONFIG_USER_ONLY
> +#define can_wait_for_safe() (1)
> +#else
> +/*
> + * We never sleep in SoftMMU emulation because we would deadlock as
> + * all vCPUs are in the same thread. This will change for MTTCG
> + * however.
> + */
> +#define can_wait_for_safe() (0)
> +#endif
> +
> +void wait_safe_cpu_work(void)
> +{
> +    while (can_wait_for_safe() && atomic_mb_read(&safe_work_pending) > 0) {

The atomic here is puzzling, see below.

> +        /*
> +         * If there is pending safe work and no pending threads we
> +         * need to signal another thread to start its work.
> +         */
> +        if (tcg_pending_threads == 0) {
> +            qemu_cond_signal(&qemu_exclusive_cond);
> +        }
> +        qemu_cond_wait(&qemu_safe_work_cond, qemu_get_cpu_work_mutex());
> +    }
> +}
>  
>  static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
>  {
> @@ -91,9 +121,18 @@ static void queue_work_on_cpu(CPUState *cpu, struct qemu_work_item *wi)
>      cpu->queued_work_last = wi;
>      wi->next = NULL;
>      wi->done = false;
> +    if (wi->safe) {
> +        atomic_inc(&safe_work_pending);
> +    }

This doesn't seem right. Operating on the condvar's shared 'state' variable
should always be done with the condvar's mutex held. Otherwise, there's
no guarantee that sleepers will always see a consistent state when they're
woken up, which can easily lead to deadlock.

I suspect this is what caused the deadlock you saw in the last iteration
of the series.
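
To make the rule concrete, here is a standalone pthread sketch of the
fix (illustrative only, not QEMU code; all the names are made up):

#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int safe_work_pending;

/* queueing side: bump the counter with the mutex held */
void queue_safe_work(void)
{
    pthread_mutex_lock(&lock);
    safe_work_pending++;   /* no atomic needed: the mutex orders it */
    pthread_mutex_unlock(&lock);
}

/* completion side: decrement and wake sleepers under the same mutex */
void complete_safe_work(void)
{
    pthread_mutex_lock(&lock);
    if (--safe_work_pending == 0) {
        pthread_cond_broadcast(&cond);
    }
    pthread_mutex_unlock(&lock);
}

/* waiting side: re-check the predicate under the mutex on each wakeup */
void wait_safe_work_done(void)
{
    pthread_mutex_lock(&lock);
    while (safe_work_pending > 0) {
        pthread_cond_wait(&cond, &lock);
    }
    pthread_mutex_unlock(&lock);
}

If the counter is updated outside the mutex, atomic or not, a sleeper
can evaluate the predicate, commit to sleeping, and miss a wakeup that
races in between -- which looks exactly like the deadlock above.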

An additional requirement is to handle new CPUs, which can appear at any
time in user-mode (imagine we're flushing the TB while a new pthread has
just been spawned). This is easily triggered by greatly reducing the
size of the translation buffer and spawning dozens of threads. This
patch, as it stands, won't catch the new threads coming in, because at
the time the "safe work" was assigned, the new threads might not yet be
seen by CPU_FOREACH. (BTW, the CPU list should be converted to RCU, but
a ppc machine might be affected; see [1].)

A possible fix is to schedule safe work after exiting the CPU loop, i.e.
with qemu_get_cpu_work_mutex held. I tried this on v4 of this patchset
and it doesn't scale very well on 64 cores (too much contention on
tb_lock), although at least it doesn't deadlock.

An alternative is to have a separate lock for safe work, and to check
for it only when no other locks are held; a good place to do this is at
the beginning of cpu_loop_exec. This scales better, and I'd argue it's
simpler. In fact, I posted a patch that does this about a year ago (!):
  https://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg02576.html
Paolo didn't like condvars, but now I see them coming up again. I guess
he still won't like the synchronize_rcu() call in there, and I don't like
it either, but I don't think that's an essential part of that patch.
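
Roughly the shape I have in mind, as a standalone sketch (again
illustrative only; none of these names are from the patch or from QEMU):

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static pthread_mutex_t safe_work_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool safe_work_pending;

/* called at the top of the exec loop, with no other locks held */
static void maybe_run_safe_work(void)
{
    /* lock-free fast path: the common case takes no lock at all */
    if (!atomic_load_explicit(&safe_work_pending, memory_order_acquire)) {
        return;
    }
    pthread_mutex_lock(&safe_work_lock);
    if (atomic_load(&safe_work_pending)) {
        /* drain and run the queued safe work items here (elided),
         * then clear the flag */
        atomic_store_explicit(&safe_work_pending, false,
                              memory_order_release);
    }
    pthread_mutex_unlock(&safe_work_lock);
}

Since the fast path is a single acquire load, tb_lock (or any other
lock) is never touched unless there is actually safe work to run, which
is why this scales better.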

Thanks,

                Emilio

[1] https://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg02581.html


