qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC v3 PATCH 01/14] Introduce TCGOpcode for memory bar


From: Pranith Kumar
Subject: Re: [Qemu-devel] [RFC v3 PATCH 01/14] Introduce TCGOpcode for memory barrier
Date: Tue, 21 Jun 2016 10:52:08 -0400

Hi Sergey,

On Mon, Jun 20, 2016 at 5:21 PM, Sergey Fedorov <address@hidden> wrote:
> On 18/06/16 07:03, Pranith Kumar wrote:
>> diff --git a/tcg/tcg.h b/tcg/tcg.h
>> index db6a062..36feca9 100644
>> --- a/tcg/tcg.h
>> +++ b/tcg/tcg.h
>> @@ -408,6 +408,20 @@ static inline intptr_t QEMU_ARTIFICIAL 
>> GET_TCGV_PTR(TCGv_ptr t)
>>  #define TCG_CALL_DUMMY_TCGV     MAKE_TCGV_I32(-1)
>>  #define TCG_CALL_DUMMY_ARG      ((TCGArg)(-1))
>>
>> +typedef enum {
>> +    TCG_MO_LD_LD    = 1,
>> +    TCG_MO_ST_LD    = 2,
>> +    TCG_MO_LD_ST    = 4,
>> +    TCG_MO_ST_ST    = 8,
>> +    TCG_MO_ALL      = 0xF, // OR of all above
>
> So TCG_MO_ALL specifies a so called "full" memory barrier?

This enum just specifies what loads and stores need to be ordered.

TCG_MO_ALL specifies that we need to order both previous loads and
stores with later loads and stores. To get a full memory barrier you
will need to pair it with BAR_SC:

TCG_MO_ALL | TCG_BAR_SC

>
>> +} TCGOrder;
>> +
>> +typedef enum {
>> +    TCG_BAR_ACQ     = 32,
>> +    TCG_BAR_REL     = 64,
>
> I'm convinced that the only practical way to represent a standalone
> acquire memory barrier is to order all previous loads with all
> subsequent loads and stores. Similarly, a standalone release memory
> barrier would order all previous loads and stores with all subsequent
> stores. [1]

Yes, here acquire would be:

(TCG_MO_LD_ST | TCG_MO_LD_LD) | TCG_BAR_ACQ

and release would be:

(TCG_MO_ST_ST | TCG_MO_LD_ST) | TCG_BAR_REL

>
> On the other hand, acquire or release semantic associated with a memory
> operation itself can be directly mapped into e.g. AArch64's Load-Acquire
> (LDAR) and Store-Release (STLR) instructions. A standalone barrier
> adjacent to a memory operation shouldn't be mapped this way because it
> should provide more strict guarantees than e.g. AArch64 instructions
> mentioned above.

You are right. That is why the load-acquire operation generates the
stronger barrier:

TCG_MO_ALL | TCG_BAR_ACQ and not the acquire barrier above. Similarly
for store-release.

>
> Therefore, I advocate for clear distinction between standalone memory
> barriers and implicit memory ordering semantics associated with memory
> operations themselves.

Any suggestions on how to make the distinction clearer? I will add a
detailed comment like the above but please let me know if you have
anything in mind.

>
> [1] http://preshing.com/20130922/acquire-and-release-fences/
>
>> +    TCG_BAR_SC      = 128,
>
> How's that different from TCG_MO_ALL?

TCG_BAR_* tells us what ordering is enforced. TCG_MO_* tells what on
what operations the ordering is to be enforced.

Thanks,
-- 
Pranith



reply via email to

[Prev in Thread] Current Thread [Next in Thread]