Re: [Qemu-devel] [PATCH v2 05/11] target-arm: Implement ccmp branchless
From: Peter Maydell
Subject: Re: [Qemu-devel] [PATCH v2 05/11] target-arm: Implement ccmp branchless
Date: Tue, 8 Sep 2015 09:19:43 +0100
On 8 September 2015 at 06:18, Richard Henderson <address@hidden> wrote:
> On 09/07/2015 10:31 AM, Peter Maydell wrote:
>>>
>>> - if (cond < 0x0e) { /* continue */
>>> - gen_set_label(label_continue);
>>> + /* If COND was false, force the flags to #nzcv.
>>> + Note that T1 = (COND ? 0 : -1), T2 = (COND ? -1 : 0). */
>>> + tcg_t1 = tcg_temp_new_i32();
>>> + tcg_t2 = tcg_temp_new_i32();
>>> + tcg_gen_neg_i32(tcg_t1, tcg_t0);
>>> + tcg_gen_subi_i32(tcg_t2, tcg_t0, 1);
>>
>>
>> t2 is ~t1, right? Do we get better/worse code if we use
>> tcg_gen_andc_i32(..., tcg_t1) rather than creating t2 and
>> using gen_and_i32 ?
>>
>>> +
>>> + if (nzcv & 8) { /* N */
>>> + tcg_gen_or_i32(cpu_NF, cpu_NF, tcg_t1);
>>> + } else {
>>> + tcg_gen_and_i32(cpu_NF, cpu_NF, tcg_t2);
>>> + }
>>> + if (nzcv & 4) { /* Z */
>>> + tcg_gen_and_i32(cpu_ZF, cpu_ZF, tcg_t2);
>>> + } else {
>>> + tcg_gen_or_i32(cpu_ZF, cpu_ZF, tcg_t0);
>>> + }
>>> + if (nzcv & 2) { /* C */
>>> + tcg_gen_or_i32(cpu_CF, cpu_CF, tcg_t0);
>>> + } else {
>>> + tcg_gen_and_i32(cpu_CF, cpu_CF, tcg_t2);
>>> + }
>>> + if (nzcv & 1) { /* V */
>>> + tcg_gen_or_i32(cpu_VF, cpu_VF, tcg_t1);
>>> + } else {
>>> + tcg_gen_and_i32(cpu_VF, cpu_VF, tcg_t2);
>
>
> If the host supports andc, it's probably better to use only the one temp.
> But otherwise we may save 4 not insns.
Is the tcg common code not smart enough to notice that it only
needs to calculate not(t1) once?
In the overwhelmingly common case (the x86 tcg backend)
we would save an insn every time, right?
> Is it worth complicating the code
> for that?
I certainly wouldn't bother making the front-end generate
different code depending on whether or not the backend has
andc.
Anyway, I'm just guessing here; you probably have a better
feel than I do for which codegen choices work better, so I'll
leave the choice up to you.
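
To make the "t2 is ~t1" point concrete, here is a small standalone C model
of the two variants under discussion. It is only a sketch, not TCG code:
the Flags struct and all function names are hypothetical, and it assumes
the target-arm flag layout (N and V in bit 31 of NF/VF, Z set when
ZF == 0, CF holding 0 or 1) and that t0 is 1 when COND is false and 0
otherwise, as the comment in the patch implies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the four flag variables (assumed layout:
   N and V in bit 31 of nf/vf, Z set when zf == 0, cf is 0 or 1). */
typedef struct { uint32_t nf, zf, cf, vf; } Flags;

/* Two-temp form, mirroring the patch.  t0 is 1 if COND is false, else 0. */
static Flags force_two_temps(Flags f, uint32_t t0, unsigned nzcv)
{
    uint32_t t1 = 0u - t0;   /* COND ? 0 : -1 */
    uint32_t t2 = t0 - 1u;   /* COND ? -1 : 0 */

    f.nf = (nzcv & 8) ? (f.nf | t1) : (f.nf & t2);
    f.zf = (nzcv & 4) ? (f.zf & t2) : (f.zf | t0);
    f.cf = (nzcv & 2) ? (f.cf | t0) : (f.cf & t2);
    f.vf = (nzcv & 1) ? (f.vf | t1) : (f.vf & t2);
    return f;
}

/* One-temp form: each "and with t2" becomes "and with ~t1" (andc). */
static Flags force_one_temp(Flags f, uint32_t t0, unsigned nzcv)
{
    uint32_t t1 = 0u - t0;   /* COND ? 0 : -1 */

    f.nf = (nzcv & 8) ? (f.nf | t1) : (f.nf & ~t1);
    f.zf = (nzcv & 4) ? (f.zf & ~t1) : (f.zf | t0);
    f.cf = (nzcv & 2) ? (f.cf | t0) : (f.cf & ~t1);
    f.vf = (nzcv & 1) ? (f.vf | t1) : (f.vf & ~t1);
    return f;
}

static int flags_equal(Flags a, Flags b)
{
    return a.nf == b.nf && a.zf == b.zf && a.cf == b.cf && a.vf == b.vf;
}

/* Decode the modelled flags back into a 4-bit NZCV value. */
static unsigned decode_nzcv(Flags f)
{
    return ((unsigned)(f.nf >> 31) << 3) | ((unsigned)(f.zf == 0) << 2)
         | ((unsigned)(f.cf & 1) << 1) | (unsigned)(f.vf >> 31);
}

/* Count cases where the two forms disagree, where a false COND does not
   yield exactly #nzcv, or where a true COND changes the flags. */
int count_mismatches(void)
{
    static const uint32_t s[] = { 0u, 1u, 0x7fffffffu, 0x80000000u, 0xffffffffu };
    const int n = (int)(sizeof(s) / sizeof(s[0]));
    int bad = 0;

    for (uint32_t t0 = 0; t0 <= 1; t0++)
    for (unsigned nzcv = 0; nzcv < 16; nzcv++)
    for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
    for (int k = 0; k < n; k++)
    for (uint32_t c = 0; c <= 1; c++) {
        Flags in = { s[i], s[j], c, s[k] };
        Flags a = force_two_temps(in, t0, nzcv);
        Flags b = force_one_temp(in, t0, nzcv);

        if (!flags_equal(a, b)) bad++;
        if (t0 ? (decode_nzcv(a) != nzcv) : !flags_equal(a, in)) bad++;
    }
    return bad;
}
```

Calling count_mismatches() from a main() should return 0 if the two forms
really are interchangeable. The model says nothing about which form
produces better host code; that still depends on whether the backend has
andc and on what the optimizer does with the repeated not.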
thanks
-- PMM