From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH v2 6/9] target-sh4: split out Q and M from of SR and optimize div1
Date: Tue, 24 Dec 2013 06:44:54 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

On 12/22/2013 03:50 AM, Aurelien Jarno wrote:
>  static void gen_read_sr(TCGv dst)
>  {
> -    tcg_gen_andi_i32(dst, cpu_sr, ~(1u << SR_T));
> -    tcg_gen_or_i32(dst, dst, cpu_sr_t);
> +    TCGv t0 = tcg_temp_new();
> +    tcg_gen_andi_i32(dst, cpu_sr,
> +                     ~((1u << SR_Q) | (1u << SR_M) | (1u << SR_T)));
> +    tcg_gen_shli_i32(t0, cpu_sr_q, SR_Q);
> +    tcg_gen_or_i32(dst, dst, t0);
> +    tcg_gen_shli_i32(t0, cpu_sr_m, SR_M);
> +    tcg_gen_or_i32(dst, dst, t0);
> +    tcg_gen_shli_i32(t0, cpu_sr_t, SR_T);
> +    tcg_gen_or_i32(dst, dst, t0);
> +    tcg_temp_free_i32(t0);
>  }

The same comments apply to SR_Q and SR_M as to SR_T: who is responsible for
clearing the relevant bits in env->sr?
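
To make the question concrete, here is a minimal plain C model of the split
representation, not the TCG code itself. The bit positions are assumed from
the SH-4 status register layout, and write_sr() is a hypothetical counterpart
that is not part of the quoted hunk. If every writer clears the split bits in
the stored word the way write_sr() does, the mask in the read path is
redundant; if not, the read path has to keep it.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions assumed from the SH-4 status register layout; the patch
 * takes them from the target's own headers. */
enum { SR_T = 0, SR_Q = 8, SR_M = 9 };

#define SR_SPLIT_MASK ((1u << SR_Q) | (1u << SR_M) | (1u << SR_T))

/* Model of the quoted gen_read_sr(): mask the split bits out of the stored
 * word, then OR in the separately kept 0/1 flag values. */
static uint32_t read_sr(uint32_t sr, uint32_t q, uint32_t m, uint32_t t)
{
    assert(q <= 1 && m <= 1 && t <= 1);
    return (sr & ~SR_SPLIT_MASK) | (q << SR_Q) | (m << SR_M) | (t << SR_T);
}

/* Hypothetical write side: extract the flags and store the remaining word
 * with the split bits already cleared. */
static void write_sr(uint32_t val, uint32_t *sr,
                     uint32_t *q, uint32_t *m, uint32_t *t)
{
    *q = (val >> SR_Q) & 1;
    *m = (val >> SR_M) & 1;
    *t = (val >> SR_T) & 1;
    *sr = val & ~SR_SPLIT_MASK;
}
```

With this pairing, a write followed by a read returns the original value, and
stale split bits left in the stored word are ignored by the read.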


>      case 0x2007:             /* div0s Rm,Rn */
>       {
> -            gen_copy_bit_i32(cpu_sr, SR_Q, REG(B11_8), 31);     /* SR_Q */
> -            gen_copy_bit_i32(cpu_sr, SR_M, REG(B7_4), 31);      /* SR_M */
> +            tcg_gen_shri_i32(cpu_sr_q, REG(B11_8), 31);         /* SR_Q */
> +            tcg_gen_mov_i32(cpu_sr_m, cpu_sr_q);                /* SR_M */
>           TCGv val = tcg_temp_new();
>              tcg_gen_xor_i32(cpu_sr_t, REG(B7_4), REG(B11_8));
>              tcg_gen_shri_i32(cpu_sr_t, cpu_sr_t, 31);           /* SR_T */

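This sets M incorrectly.  Q and M are set from different source registers: as
in the removed lines, Q is bit 31 of REG(B11_8) and M is bit 31 of REG(B7_4),
so M cannot simply be copied from Q.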

As a point of optimization, T no longer needs the shift if it is computed from
the extracted Q and M, which are already single-bit values.
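
A minimal plain C model of the div0s flag computation described above, as a
sketch rather than the TCG sequence. The struct and function names are
hypothetical. rn stands for REG(B11_8) and rm for REG(B7_4).

```c
#include <stdint.h>

/* Hypothetical container for the three flags that div0s sets. */
typedef struct {
    uint32_t q, m, t;
} Div0sFlags;

/* Model of div0s Rm,Rn with the flags kept as separate 0/1 values. */
static Div0sFlags div0s(uint32_t rn, uint32_t rm)
{
    Div0sFlags f;
    f.q = rn >> 31;     /* Q: sign bit of Rn */
    f.m = rm >> 31;     /* M: sign bit of Rm, a different source than Q */
    f.t = f.q ^ f.m;    /* T: XOR of the extracted bits, no further shift */
    return f;
}
```

The result for T is the same as XORing the full registers and shifting right
by 31, but it reuses the two shifts that are needed for Q and M anyway.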

> +            /* add or subtract arg0 from arg1 depending if Q == M */
> +            tcg_gen_xor_i32(t1, cpu_sr_q, cpu_sr_m);
> +            tcg_gen_subi_i32(t1, t1, 1);
> +            tcg_gen_neg_i32(t2, REG(B7_4));
> +            tcg_gen_movcond_i32(TCG_COND_EQ, t2, t1, zero, REG(B7_4), t2);
> +            tcg_gen_add2_i32(REG(B11_8), t1, REG(B11_8), zero, t2, t1);

Why so complicated with the comparison?  I'd have expected

  tcg_gen_movcond_i32(TCG_COND_EQ, t2, cpu_sr_q, cpu_sr_m, REG(B7_4), t2);

Hmm... except I see you're re-using the condition as the high-part of the add2.
That's pretty tricky.  Perhaps expand upon the comment?
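
For reference, a plain C model of the quoted five-op sequence, as a sketch of
what an expanded comment might describe. The struct and function names are
hypothetical. rn stands for REG(B11_8), rm for REG(B7_4), and the 64-bit
addition stands in for add2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two 32-bit halves produced by add2. */
typedef struct {
    uint32_t lo, hi;
} Add2Result;

/* Model of the quoted xor / subi / neg / movcond / add2 sequence. */
static Add2Result div1_addsub(uint32_t rn, uint32_t rm,
                              uint32_t q, uint32_t m)
{
    assert(q <= 1 && m <= 1);

    /* xor + subi: all ones when Q == M, zero when Q != M. */
    uint32_t t1 = (q ^ m) - 1;

    /* neg + movcond(EQ, t1, zero): rm when Q != M, -rm when Q == M. */
    uint32_t t2 = 0u - rm;
    t2 = (t1 == 0) ? rm : t2;

    /* add2: (hi:lo) = (0:rn) + (t1:t2).  The value t1 is used twice: as
     * the selector above and as the high word of the second operand. */
    uint64_t sum = (uint64_t)rn + (((uint64_t)t1 << 32) | t2);

    Add2Result r = { (uint32_t)sum, (uint32_t)(sum >> 32) };
    return r;
}
```

In this model the low half is rn + rm when Q != M and rn - rm when Q == M.
The high half is the carry out of the addition in the first case. In the
second case it is zero when the low-word addition carries and all ones
otherwise. Note also that the quoted sequence selects rm when Q != M, so a
direct comparison of Q and M that feeds the same operands to movcond would
need the not-equal condition to match it.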


r~