From: Richard Henderson
Subject: Re: [Qemu-devel] [RFC 02/30] tcg: add tcg_cmpxchg_lock
Date: Mon, 27 Jun 2016 13:07:42 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1
On 06/27/2016 12:01 PM, Emilio G. Cota wrote:
> This set of locks will allow us to correctly emulate cmpxchg16
> in a parallel TCG. The key observation is that no architecture
> supports 16-byte regular atomic load/stores; only "locked" accesses
> (e.g. via cmpxchg16b on x86) are allowed, and therefore we can emulate
> them by using locks.
>
> We use a small array of locks so that we can have some scalability.
> Further improvements are possible (e.g. using a radix tree); but
> we should have a workload to benchmark in order to justify the
> additional complexity.
>
> Signed-off-by: Emilio G. Cota <address@hidden>
> ---
>  cpu-exec.c        |  1 +
>  linux-user/main.c |  1 +
>  tcg/tcg.h         |  5 +++++
>  translate-all.c   | 39 +++++++++++++++++++++++++++++++++++++++
>  4 files changed, 46 insertions(+)
As formulated, this doesn't work. In order to support cmpxchg16 without a native one, you have to use locks on *all* operations, lest a 4-byte atomic operation and a 16-byte operation be simultaneous in the same address range.
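To illustrate the point, here is a minimal sketch of what "locks on all operations" could look like. It is not the code from the patch: N_LOCKS, lock_for(), emu_cmpxchg16() and emu_fetch_add32() are hypothetical names, and pthread mutexes stand in for whatever lock primitive would actually be used. The important part is that the 4-byte helper takes the same lock as the 16-byte helper whenever both touch the same 16-byte block.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define N_LOCKS 64  /* hypothetical size of the lock array */

static pthread_mutex_t locks[N_LOCKS];

/* Must be called once before any of the helpers below. */
static void emu_locks_init(void)
{
    for (int i = 0; i < N_LOCKS; i++) {
        pthread_mutex_init(&locks[i], NULL);
    }
}

/* Pick a lock from the 16-byte-aligned block that contains addr, so
 * every access inside one block contends on the same lock. */
static pthread_mutex_t *lock_for(const void *addr)
{
    return &locks[((uintptr_t)addr >> 4) % N_LOCKS];
}

/* Stand-in for a 16-byte guest value, aligned to 16 bytes. */
typedef struct {
    _Alignas(16) uint64_t lo;
    uint64_t hi;
} u128;

/* 16-byte compare-and-swap emulated under a lock; true on success. */
static bool emu_cmpxchg16(u128 *addr, u128 expected, u128 desired)
{
    pthread_mutex_t *l = lock_for(addr);
    bool ok;

    assert(((uintptr_t)addr & 15) == 0);
    pthread_mutex_lock(l);
    ok = addr->lo == expected.lo && addr->hi == expected.hi;
    if (ok) {
        *addr = desired;
    }
    pthread_mutex_unlock(l);
    return ok;
}

/* A narrower atomic has to take the same lock. If it used a native
 * atomic instruction instead, it could modify part of the 16 bytes
 * between the compare and the store above. Returns the old value. */
static uint32_t emu_fetch_add32(uint32_t *addr, uint32_t val)
{
    pthread_mutex_t *l = lock_for(addr);
    uint32_t old;

    pthread_mutex_lock(l);
    old = *addr;
    *addr = old + val;
    pthread_mutex_unlock(l);
    return old;
}
```

With this scheme every guest atomic pays for a lock, even the narrow ones that the host could do natively.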
Thankfully, the most common hosts (x86_64, aarch64, power7, s390x) do have a 16-byte cmpxchg, so this shouldn't really matter much in practice.
It would be nice to continue to support the other hosts (arm32, mips, ppc32, sparc, i686) without locks when the guest doesn't require wider atomics than the host supports.
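One way to picture that last point is a build-time comparison of the widest atomic the guest can issue against the widest one the host can do natively. This is only a sketch under assumed names: HOST_ATOMIC_MAX_BYTES, GUEST_ATOMIC_MAX_BYTES, need_locks() and guest_cmpxchg32() are hypothetical and the default values are placeholders, not a description of any particular host or guest.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical build-time parameters; real values would come from the
 * host and guest configuration. */
#ifndef HOST_ATOMIC_MAX_BYTES
#define HOST_ATOMIC_MAX_BYTES 8
#endif
#ifndef GUEST_ATOMIC_MAX_BYTES
#define GUEST_ATOMIC_MAX_BYTES 4
#endif

static_assert(GUEST_ATOMIC_MAX_BYTES <= 16, "widest guest atomic considered here is 16 bytes");

/* Locks are needed only when the guest can issue an atomic that is
 * wider than anything the host can do natively. */
static bool need_locks(unsigned guest_max, unsigned host_max)
{
    return guest_max > host_max;
}

/* 4-byte guest cmpxchg; returns the value that was in memory. */
static uint32_t guest_cmpxchg32(_Atomic uint32_t *addr,
                                uint32_t expected, uint32_t desired)
{
#if GUEST_ATOMIC_MAX_BYTES <= HOST_ATOMIC_MAX_BYTES
    /* Lock-free path: no guest access is wider than a host atomic, so
     * no locked wide operation can overlap this one. On failure the
     * call stores the observed value into expected; on success
     * expected already equals the old value. */
    atomic_compare_exchange_strong(addr, &expected, desired);
    return expected;
#else
#error "locked path omitted from this sketch: every atomic would take the lock"
#endif
}
```

With the placeholder values above, need_locks(GUEST_ATOMIC_MAX_BYTES, HOST_ATOMIC_MAX_BYTES) is false and the 4-byte cmpxchg compiles down to the host's native operation.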
r~