From: Peter Sewell
Subject: Re: [Qemu-devel] Java volatile vs. C11 seq_cst (was Re: [PATCH v2 1/2] add a header file for atomic operations)
Date: Tue, 18 Jun 2013 16:29:45 +0100
On 18 June 2013 15:50, Paul E. McKenney <address@hidden> wrote:
> On Tue, Jun 18, 2013 at 03:24:24PM +0200, Paolo Bonzini wrote:
>> Il 17/06/2013 20:57, Richard Henderson ha scritto:
>> >> + * And for the few ia64 lovers that exist, an atomic_mb_read is a ld.acq,
>> >> + * while an atomic_mb_set is a st.rel followed by a memory barrier.
>> > ...
>> >> + */
>> >> +#ifndef atomic_mb_read
>> >> +#if QEMU_GNUC_PREREQ(4, 8)
>> >> +#define atomic_mb_read(ptr) ({ \
>> >> + typeof(*ptr) _val; \
>> >> + __atomic_load(ptr, &_val, __ATOMIC_SEQ_CST); \
>> >> + _val; \
>> >> +})
>> >> +#else
>> >> +#define atomic_mb_read(ptr) ({ \
>> >> + typeof(*ptr) _val = atomic_read(ptr); \
>> >> + smp_rmb(); \
>> >> + _val; \
>> >> +})
>> >
>> > This latter definition is ACQUIRE not SEQ_CST (except for ia64).
>> > Without load_acquire, one needs barriers before and after the
>> > atomic_read in order to implement SEQ_CST.
>>
>> The store-load barrier between atomic_mb_set and atomic_mb_read is
>> provided by the atomic_mb_set. The load-load barrier between two
>> atomic_mb_reads is provided by the first read.
>>
>> > So again I have to ask, what semantics are you actually looking for here?
>>
>> So, everything I found points to Java volatile being sequentially
>> consistent, though I'm still not sure why C11 suggests hwsync for load
>> seq-cst on POWER instead of lwsync. Comparing the sequences that my
>> code (based on the JSR-133 cookbook) generates with the C11 suggestions,
>> you get:
>>
>> Load seqcst hwsync; ld; cmp; bc; isync
>> Store seqcst hwsync; st
>> (source: http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html)
>>
>> Load Java volatile ld; lwsync
>> Store Java volatile lwsync; st; hwsync
>> (source: http://g.oswego.edu/dl/jmm/cookbook.html)
>>
>> where the lwsync in loads acts as a load-load barrier, while the one in
>> stores acts as load-store + store-store barrier.
(I don't know the context of these mails, but let me add a gloss to
correct a common misconception there: it's not sufficient to think
about these things solely in terms of thread-local reordering, as the
tutorial that Paul points to below explains. One also has to think
about the lack of multi-copy atomicity, as seen in the IRIW example.)
>> Is the cookbook known to be wrong?
>>
>> Or is Java volatile somewhere between acq_rel and seq_cst, as the last
>> paragraph of
>> http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#volatile
>> seems to suggest?
>
> First, I am not a fan of SC, mostly because there don't seem to be many
> (any?) production-quality algorithms that need SC. But if you really
> want to take a parallel-programming trip back to the 1980s, let's go! ;-)
>
> The canonical litmus test for SC is independent reads of independent
> writes, or IRIW for short. It is as follows, with x and y both initially
> zero:
>
>     Thread 0      Thread 1      Thread 2      Thread 3
>     x=1;          y=1;          r1=x;         r2=y;
>                                 r3=y;         r4=x;
>
> On a sequentially consistent system, r1==1&&r2==1&&r3==0&&r4==0 is
> a forbidden final outcome, where "final" means that we wait for all
> four threads to terminate and for their effects on memory to propagate
> fully through the system before checking the final values.
>
> The first mapping has been demonstrated to be correct at the URL
> you provide. So let's apply the second mapping:
>
>     Thread 0      Thread 1      Thread 2      Thread 3
>     lwsync;       lwsync;       r1=x;         r2=y;
>     x=1;          y=1;          lwsync;       lwsync;
>     hwsync;       hwsync;       r3=y;         r4=x;
>                                 lwsync;       lwsync;
>
> Now barriers operate by forcing ordering between accesses on either
> side of the barrier. This means that barriers at the very start or
> the very end of a given thread's execution have no effect and may
> be ignored. This results in the following:
>
>     Thread 0      Thread 1      Thread 2      Thread 3
>     x=1;          y=1;          r1=x;         r2=y;
>                                 lwsync;       lwsync;
>                                 r3=y;         r4=x;
>
> This sequence does -not- result in SC execution, as documented in the
> table at the beginning of Section 6.3 on page 23 of:
>
> http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf
>
> See the row with "IRIW+lwsync", which shows that this code sequence does
> not preserve SC ordering, neither in theory nor on real hardware. (And I
> strongly recommend this paper, even before having read it thoroughly
> myself, hence my taking the precaution of CCing the authors in case I
> am misinterpreting something.) If you want to prove it for yourself,
> use the software tool called out at http://lwn.net/Articles/470681/.
>
> So is this a real issue? Show me a parallel algorithm that relies on
> SC, demonstrate that it is the best algorithm for the problem that it
> solves, and then demonstrate that your problem statement is the best
> match for some important real-world problem. Until then, no, this is
> not a real issue.
>
> Thanx, Paul
>
> PS: Nevertheless, I personally prefer the C++ formulation, but that is
> only because I stand with one foot in theory and the other in
> practice. If I were a pure practitioner, I would probably strongly
> prefer the Java formulation.
>