Re: [Qemu-devel] [RFC PATCH] tcg: Optimize fence instructions

From:	Pranith Kumar
Subject:	Re: [Qemu-devel] [RFC PATCH] tcg: Optimize fence instructions
Date:	Tue, 19 Jul 2016 14:55:15 -0400

Paolo Bonzini writes:

> On 14/07/2016 22:29, Pranith Kumar wrote:
>> +            } else if (curr_mb_type == TCG_BAR_STRL &&
>> +                       prev_mb_type == TCG_BAR_LDAQ) {
>> +                /* Consecutive load-acquire and store-release barriers
>> +                 * can be merged into one stronger SC barrier
>> +                 * ldaq; strl => ld; mb; st
>> +                 */
>> +                args[0] = (args[0] & 0x0F) | TCG_BAR_SC;
>> +                tcg_op_remove(s, prev_op);
>
> Is this really an optimization?  For example the processor could reorder
> "st1; ldaq1; strl2; ld2" to "ldaq1; ld2; st1; strl2".  It cannot do this
> if you change ldaq1/strl2 to ld1/mb/st2.
>
> On x86 for example a memory fence costs ~50 clock cycles, while normal
> loads and stores are of course faster.
>
> Of course this is useful if your target doesn't have ldaq/strl
> instructions.  In this case, however, you probably want to lower ldaq to
> "ld;mb" and strl to "mb;st"; the other optimizations then will remove
> the unnecessary barrier.
>

I agree that this is a conservative optimization. The problem is that
currently, even for architectures which have ldaq/strl instructions, the TCG
backend does not generate them. TCG just generates plain loads and stores. I
guess we didn't need to since TCG was single-threaded before MTTCG.

I am trying to add support to generate these instructions on AARCH64. Once
this is done we can disable the above optimization.
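
For reference, the lowering you suggest for targets without ldaq/strl could
look roughly like the sketch below. This is only an illustration on a made-up
toy IR: the toy_* names and the ORD_* bit encoding are hypothetical and are not
TCG's actual ops or flags. It lowers ldaq to "ld; mb" and strl to "mb; st",
then merges adjacent barriers by OR-ing their ordering bits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy IR; these names are illustrative, not QEMU's. */
enum toy_opc { TOY_LD, TOY_ST, TOY_LDAQ, TOY_STRL, TOY_MB };

/* Ordering bits carried by a TOY_MB op (illustrative encoding). */
enum {
    ORD_LD_LD = 1, ORD_LD_ST = 2, ORD_ST_LD = 4, ORD_ST_ST = 8
};

struct toy_op { enum toy_opc opc; unsigned ord; };

/* Lower ldaq to "ld; mb" and strl to "mb; st". Returns ops written. */
static size_t toy_lower(const struct toy_op *in, size_t n,
                        struct toy_op *out, size_t cap)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        assert(k + 2 <= cap);   /* each input op expands to at most two */
        switch (in[i].opc) {
        case TOY_LDAQ:
            out[k++] = (struct toy_op){ TOY_LD, 0 };
            out[k++] = (struct toy_op){ TOY_MB, ORD_LD_LD | ORD_LD_ST };
            break;
        case TOY_STRL:
            out[k++] = (struct toy_op){ TOY_MB, ORD_LD_ST | ORD_ST_ST };
            out[k++] = (struct toy_op){ TOY_ST, 0 };
            break;
        default:
            out[k++] = in[i];
        }
    }
    return k;
}

/* Merge runs of adjacent barriers in place by OR-ing ordering bits. */
static size_t toy_merge_mb(struct toy_op *ops, size_t n)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        if (k > 0 && ops[i].opc == TOY_MB && ops[k - 1].opc == TOY_MB) {
            ops[k - 1].ord |= ops[i].ord;   /* fold into previous barrier */
        } else {
            ops[k++] = ops[i];
        }
    }
    return k;
}
```

With this encoding, "ldaq; strl" ends up as "ld; mb; st" where the single
barrier orders load-load, load-store and store-store but not store-load, so it
is weaker than the full SC barrier produced by the hunk quoted above.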

-- 
Pranith
