[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [RFC v6 00/14] Slow-path for atomic instruction transla
From: |
alvise rigo |
Subject: |
Re: [Qemu-devel] [RFC v6 00/14] Slow-path for atomic instruction translation |
Date: |
Tue, 15 Dec 2015 14:59:39 +0100 |
Hi Paolo,
On Mon, Dec 14, 2015 at 11:17 AM, Paolo Bonzini <address@hidden> wrote:
>
>
> On 14/12/2015 11:04, alvise rigo wrote:
>> In any case, what I proposed in the mttcg based v5 was:
>> - A LL ensures that the TLB_EXCL flag is set on all the CPU's TLB.
>> This is done by querying a TLB flush to all (not exactly all...) the
>> CPUs. To be 100% safe, probably we should also wait that the flush is
>> actually performed
>> - A TLB_EXCL flag set always forces the slow-path, allowing the CPUs
>> to check for possible collision with a "exclusive memory region"
>>
>> Now, why the fact of querying the flush (and possibly ensuring that
>> the flush has been actually done) should not be enough?
>
> There will always be a race where the normal store fails. While I
> haven't studied your code enough to do a constructive proof, it's enough
> to prove the impossibility of what you're trying to do. Mind, I also
> believed for a long time that it was possible to do it!
>
> If we have two CPUs, with CPU 0 executing LL and the CPU 1 executing a
> store, you can model this as a consensus problem. For example, CPU 0
> could propose that the subsequent SC succeeds, while CPU 1 proposes that
> it fails. The outcome of the SC instruction depends on who wins.
I see your point. This, as you wrote, holds only when we attempt to
make the fast path wait-free.
However, the implementation I proposed is not wait-free and somehow
serializes the accesses made to the shared resources (that will
determine if the access was successful or not) by means of a mutex.
The assumption I made - and somehow verified - is that the "colliding
fast accesses" are rare. I guess you also agree on this, otherwise how
could a wait-free implementation possibly work without being coupled
with primitives with appropriate consensus number?
Thank you,
alvise
>
> Therefore, implementing LL/SC problem requires---on both CPU 0 and CPU
> 1, and hence for both LL/SC and normal store---an atomic primitive with
> consensus number >= 2. Other than LL/SC itself, the commonly-available
> operations satisfying this requirement are test-and-set (consensus
> number 2) and compare-and-swap (infinite consensus number). Normal
> memory reads and writes (called "atomic registers" in multi-processing
> research lingo) have consensus number 1; it's not enough.
>
> If the host had LL/SC, CPU 1 could in principle delegate its side of the
> consensus problem to the processor; but even that is not a solution
> because processors constrain the instructions that can appear between
> the load and the store, and this could cause an infinite sequence of
> spurious failed SCs. Another option is transactional memory, but it's
> also too slow for normal stores.
>
> The simplest solution is not to implement full LL/SC semantics; instead,
> similar to linux-user, a SC operation can perform a cmpxchg from the
> value fetched by LL to the argument of SC. This bypasses the issue
> because stores do not have to be instrumented at all, but it does mean
> that the emulation suffers from the ABA problem.
>
> TLB_EXCL is also a middle-ground, a little bit stronger than cmpxchg.
> It's more complex and more accurate, but also not perfect. Which is
> okay, but has to be documented.
>
> Paolo
- Re: [Qemu-devel] [RFC v6 12/14] softmmu: Simplify helper_*_st_name, wrap RAM code, (continued)
- [Qemu-devel] [RFC v6 13/14] softmmu: Include MMIO/invalid exclusive accesses, Alvise Rigo, 2015/12/14
- [Qemu-devel] [RFC v6 14/14] softmmu: Protect MMIO exclusive range, Alvise Rigo, 2015/12/14
- [Qemu-devel] [RFC v6 11/14] softmmu: Simplify helper_*_st_name, wrap MMIO code, Alvise Rigo, 2015/12/14
- [Qemu-devel] [RFC v6 09/14] softmmu: Add history of excl accesses, Alvise Rigo, 2015/12/14
- Re: [Qemu-devel] [RFC v6 00/14] Slow-path for atomic instruction translation, Paolo Bonzini, 2015/12/14
Re: [Qemu-devel] [RFC v6 00/14] Slow-path for atomic instruction translation, Andreas Tobler, 2015/12/14
Re: [Qemu-devel] [RFC v6 00/14] Slow-path for atomic instruction translation, Alex Bennée, 2015/12/17