


From: Richard Henderson
Subject: Re: [Qemu-arm] [PATCH 14/67] target/arm: Convert multiply and multiply accumulate
Date: Mon, 5 Aug 2019 09:20:02 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

On 8/5/19 8:32 AM, Peter Maydell wrote:
>> -/* load a 32-bit value from a register and perform a 64-bit accumulate.  */
>> -static void gen_addq_lo(DisasContext *s, TCGv_i64 val, int rlow)
>> -{
>> -    TCGv_i64 tmp;
>> -    TCGv_i32 tmp2;
>> -
>> -    /* Load value and extend to 64 bits.  */
>> -    tmp = tcg_temp_new_i64();
>> -    tmp2 = load_reg(s, rlow);
>> -    tcg_gen_extu_i32_i64(tmp, tmp2);
>> -    tcg_temp_free_i32(tmp2);
>> -    tcg_gen_add_i64(val, val, tmp);
>> -    tcg_temp_free_i64(tmp);
>> -}
>> -
> 
>> +static bool trans_UMAAL(DisasContext *s, arg_UMAAL *a)
>> +{
>> +    TCGv_i32 t0, t1, t2, zero;
>> +
>> +    if (s->thumb
>> +        ? !arm_dc_feature(s, ARM_FEATURE_THUMB_DSP)
>> +        : !ENABLE_ARCH_6) {
>> +        return false;
>> +    }
>> +
>> +    t0 = load_reg(s, a->rm);
>> +    t1 = load_reg(s, a->rn);
>> +    tcg_gen_mulu2_i32(t0, t1, t0, t1);
>> +    zero = tcg_const_i32(0);
>> +    t2 = load_reg(s, a->ra);
>> +    tcg_gen_add2_i32(t0, t1, t0, t1, t2, zero);
>> +    tcg_temp_free_i32(t2);
>> +    t2 = load_reg(s, a->rd);
>> +    tcg_gen_add2_i32(t0, t1, t0, t1, t2, zero);
>> +    tcg_temp_free_i32(t2);
>> +    tcg_temp_free_i32(zero);
>> +    store_reg(s, a->ra, t0);
>> +    store_reg(s, a->rd, t1);
>> +    return true;
>> +
> 
> Is using mulu2/add2/add2 like this really generating better
> code than the mulu_i64_i32 and 2 64-bit adds that we had before?
> If we're going to change how we're generating code it would be
> nice to at least mention it in the commit message...

I didn't really think about the code generation difference, merely that it
seemed more obvious, given that all of the inputs are i32, and we need i32
outputs.  I assumed it wasn't written like this in the first place because
tcg_gen_mulu2_i32 is relatively new.


r~
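
For readers comparing the two formulations discussed above, here is a minimal
plain-C sketch of the arithmetic only. It is not QEMU code: model_mulu2 and
model_add2 are hypothetical stand-ins that mimic what tcg_gen_mulu2_i32 and
tcg_gen_add2_i32 compute on a lo/hi pair, and umaal_pairs / umaal_wide are
illustrative names for the "mulu2/add2/add2" and "64-bit multiply plus two
64-bit adds" approaches. It says nothing about which one yields better host
code; it only shows that both produce the same 64-bit result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of tcg_gen_mulu2_i32: unsigned 32x32 -> 64 multiply,
 * with the product returned as separate low and high 32-bit halves. */
static void model_mulu2(uint32_t *lo, uint32_t *hi, uint32_t a, uint32_t b)
{
    uint64_t p = (uint64_t)a * b;
    *lo = (uint32_t)p;
    *hi = (uint32_t)(p >> 32);
}

/* Hypothetical model of tcg_gen_add2_i32: (hi:lo) = (ah:al) + (bh:bl),
 * propagating the carry out of the low half into the high half. */
static void model_add2(uint32_t *lo, uint32_t *hi,
                       uint32_t al, uint32_t ah, uint32_t bl, uint32_t bh)
{
    uint32_t l = al + bl;
    uint32_t carry = l < al;    /* unsigned wraparound signals a carry */
    *hi = ah + bh + carry;
    *lo = l;
}

/* Pair-based form, following the order of operations in trans_UMAAL. */
static uint64_t umaal_pairs(uint32_t rn, uint32_t rm, uint32_t ra, uint32_t rd)
{
    uint32_t t0, t1;
    model_mulu2(&t0, &t1, rm, rn);
    model_add2(&t0, &t1, t0, t1, ra, 0);
    model_add2(&t0, &t1, t0, t1, rd, 0);
    return ((uint64_t)t1 << 32) | t0;
}

/* Wide form: one 64-bit product followed by two 64-bit adds of
 * zero-extended 32-bit register values. */
static uint64_t umaal_wide(uint32_t rn, uint32_t rm, uint32_t ra, uint32_t rd)
{
    uint64_t acc = (uint64_t)rn * rm;
    acc += ra;
    acc += rd;
    return acc;
}

/* Compare both forms over a few corner values, including all-ones. */
static void check_umaal_models(void)
{
    static const uint32_t v[] = { 0u, 1u, 0x7fffffffu, 0x80000000u, 0xffffffffu };
    const unsigned n = sizeof(v) / sizeof(v[0]);
    for (unsigned i = 0; i < n; i++)
        for (unsigned j = 0; j < n; j++)
            for (unsigned k = 0; k < n; k++)
                for (unsigned m = 0; m < n; m++)
                    assert(umaal_pairs(v[i], v[j], v[k], v[m]) ==
                           umaal_wide(v[i], v[j], v[k], v[m]));
}
```

Calling check_umaal_models() from a test harness exercises both forms. The
sum cannot overflow 64 bits, since (2^32 - 1)^2 + 2 * (2^32 - 1) = 2^64 - 1,
so neither form needs to handle a carry out of the high half.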


