
Re: [Qemu-devel] [PATCH v3 3/8] target-sh4: optimize addc using add2


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH v3 3/8] target-sh4: optimize addc using add2
Date: Thu, 04 Jun 2015 12:54:32 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0


On 04/06/2015 07:03, Richard Henderson wrote:
>> +            tcg_gen_add2_i32(t1, t2, REG(B11_8), t0, REG(B7_4), t0);
>> +            tcg_gen_add2_i32(REG(B11_8), cpu_sr_t, t1, t2, cpu_sr_t, t0);
> 
> Swap these two adds and you don't need t2.  You can consume sr_t
> immediately and start producing it in the same go.

Could TCG do some kind of intra-basic-block live range splitting?  In
this case, the new sr_t could be allocated to a different register than
the old one, saving one instruction on 2-address targets.

The pseudocode below uses "dest, src" operand order:

   // add2(t1, cpu_sr_t, cpu_sr_t, t0, REG(B7_4), t0)
   add sr_t_in, B7_4    // instead of mov t1, sr_t; add t1, B7_4
   mov sr_t_out, 0
   adc sr_t_out, 0      // cout(B7_4 + sr_t_in)

   // add2(REG(B11_8), cpu_sr_t, t1, cpu_sr_t, REG(B11_8), t0)
   add B11_8, sr_t_in   // B11_8 + B7_4 + sr_t_in
   adc sr_t_out, 0      // cout(B11_8 + B7_4 + sr_t_in)
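
As a sanity check on the reordering, here is a small standalone C model. It is not QEMU code: add2_i32() below is only an assumed stand-in for the double-word semantics of tcg_gen_add2_i32 (low/high result, low/high of each operand), and the names addc_patch, addc_swapped and check_orderings are made up for illustration. It compares the ordering from the patch (needs t1 and t2) with the swapped ordering (needs only t1, with the carry going straight into T).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed model of a 32-bit double-word add: (rh:rl) = (ah:al) + (bh:bl). */
static void add2_i32(uint32_t *rl, uint32_t *rh,
                     uint32_t al, uint32_t ah,
                     uint32_t bl, uint32_t bh)
{
    uint64_t a = ((uint64_t)ah << 32) | al;
    uint64_t b = ((uint64_t)bh << 32) | bl;
    uint64_t r = a + b;

    *rl = (uint32_t)r;
    *rh = (uint32_t)(r >> 32);
}

/* Ordering from the patch: Rn + Rm first, then add T. Uses t1 and t2. */
static uint32_t addc_patch(uint32_t rn, uint32_t rm, uint32_t t_in,
                           uint32_t *t_out)
{
    uint32_t t1, t2, res;

    add2_i32(&t1, &t2, rn, 0, rm, 0);        /* t2:t1 = Rn + Rm      */
    add2_i32(&res, t_out, t1, t2, t_in, 0);  /* T:res = t2:t1 + T    */
    return res;
}

/* Swapped ordering: T + Rm first, carry lands in T; then add Rn. */
static uint32_t addc_swapped(uint32_t rn, uint32_t rm, uint32_t t_in,
                             uint32_t *t_out)
{
    uint32_t t1, t = t_in, res;

    add2_i32(&t1, &t, t, 0, rm, 0);          /* T:t1  = T + Rm       */
    add2_i32(&res, &t, t1, t, rn, 0);        /* T:res = T:t1 + Rn    */
    *t_out = t;
    return res;
}

/* Assert that both orderings give the same sum and the same carry. */
static void check_orderings(uint32_t rn, uint32_t rm, uint32_t t_in)
{
    uint32_t ta, tb;
    uint32_t a = addc_patch(rn, rm, t_in, &ta);
    uint32_t b = addc_swapped(rn, rm, t_in, &tb);

    assert(a == b);
    assert(ta == tb);
    assert(ta <= 1);   /* the carry out is always a single bit */
}
```

Calling check_orderings() on the corner cases (for example Rn = Rm = 0xffffffff with T = 1) exercises the case where the carry comes from either of the two partial sums. The model says nothing about register allocation, which is what the question above is really about.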

Paolo


