[PULL 09/46] target/i386: avoid trunc and ext for MULX and RORX
From: Paolo Bonzini
Subject: [PULL 09/46] target/i386: avoid trunc and ext for MULX and RORX
Date: Sun, 31 Dec 2023 09:44:25 +0100
Use _tl operations for 32-bit operands on 32-bit targets, and only go
through trunc and extu ops on 64-bit targets. While the trunc/ext
ops should be pretty much free after optimization, the optimizer
does not like having the same temporary used in multiple EBBs
(extended basic blocks). Therefore it is nicer not to use tmpN*
unless necessary.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/tcg/emit.c.inc | 37 +++++++++++++++++++++++++------------
1 file changed, 25 insertions(+), 12 deletions(-)
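The trunc/extu distinction described in the commit message can be modelled in plain C. This is only a sketch of the semantics, not QEMU code: the helper names ror32, rorx32_on_64bit_target and rorx32_on_32bit_target are hypothetical, and uint64_t/uint32_t stand in for a 64-bit and a 32-bit target_ulong respectively.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value right by n bits; n must already be masked to 0..31. */
static uint32_t ror32(uint32_t v, unsigned n)
{
    assert(n < 32);
    /* Special-case n == 0 to avoid an undefined shift by 32. */
    return n ? (v >> n) | (v << (32 - n)) : v;
}

/*
 * 64-bit target (assumed model): registers are 64 bits wide, so a 32-bit
 * RORX truncates, rotates in 32 bits, then zero-extends the result.
 */
static uint64_t rorx32_on_64bit_target(uint64_t t0, unsigned imm)
{
    uint32_t tmp = (uint32_t)t0;      /* models tcg_gen_trunc_tl_i32 */
    tmp = ror32(tmp, imm & 31);       /* models tcg_gen_rotri_i32 */
    return (uint64_t)tmp;             /* models tcg_gen_extu_i32_tl */
}

/*
 * 32-bit target (assumed model): registers are already 32 bits wide, so
 * the rotate can act on the register directly with no temporary.
 */
static uint32_t rorx32_on_32bit_target(uint32_t t0, unsigned imm)
{
    return ror32(t0, imm & 31);       /* models tcg_gen_rotri_tl */
}
```

For instance, rorx32_on_64bit_target(0xdeadbeef00000001ULL, 1) discards the upper half and returns 0x80000000, the same bit pattern that rorx32_on_32bit_target(1u, 1) produces without any intermediate temporary.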
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 98c4c9569ef..f5e44117eab 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1348,7 +1348,8 @@ static void gen_MULX(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
/* low part of result in VEX.vvvv, high in MODRM */
switch (ot) {
- default:
+ case MO_32:
+#ifdef TARGET_X86_64
tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
tcg_gen_trunc_tl_i32(s->tmp3_i32, s->T1);
tcg_gen_mulu2_i32(s->tmp2_i32, s->tmp3_i32,
@@ -1356,13 +1357,15 @@ static void gen_MULX(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
tcg_gen_extu_i32_tl(cpu_regs[s->vex_v], s->tmp2_i32);
tcg_gen_extu_i32_tl(s->T0, s->tmp3_i32);
break;
-#ifdef TARGET_X86_64
- case MO_64:
- tcg_gen_mulu2_i64(cpu_regs[s->vex_v], s->T0, s->T0, s->T1);
- break;
-#endif
- }
+ case MO_64:
+#endif
+ tcg_gen_mulu2_tl(cpu_regs[s->vex_v], s->T0, s->T0, s->T1);
+ break;
+
+ default:
+ g_assert_not_reached();
+ }
}
static void gen_PALIGNR(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
@@ -1765,14 +1768,24 @@ static void gen_PSLLDQ_i(DisasContext *s, CPUX86State *env, X86DecodedInsn *deco
static void gen_RORX(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
{
MemOp ot = decode->op[0].ot;
- int b = decode->immediate;
+ int mask = ot == MO_64 ? 63 : 31;
+ int b = decode->immediate & mask;
- if (ot == MO_64) {
- tcg_gen_rotri_tl(s->T0, s->T0, b & 63);
- } else {
+ switch (ot) {
+ case MO_32:
+#ifdef TARGET_X86_64
tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
- tcg_gen_rotri_i32(s->tmp2_i32, s->tmp2_i32, b & 31);
+ tcg_gen_rotri_i32(s->tmp2_i32, s->tmp2_i32, b);
tcg_gen_extu_i32_tl(s->T0, s->tmp2_i32);
+ break;
+
+ case MO_64:
+#endif
+ tcg_gen_rotri_tl(s->T0, s->T0, b);
+ break;
+
+ default:
+ g_assert_not_reached();
}
}
--
2.43.0
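
For readers unfamiliar with the two-output multiply used in gen_MULX, the 32-bit path on a 64-bit target can be sketched in plain C as follows. This is an illustrative model under stated assumptions, not QEMU code: mulu2_32 and mulx32_on_64bit_target are hypothetical names, and the vvvv/modrm output parameters stand in for cpu_regs[s->vex_v] and s->T0.

```c
#include <assert.h>
#include <stdint.h>

/* Models tcg_gen_mulu2_i32: unsigned 32x32 -> 64 multiply, split into halves. */
static void mulu2_32(uint32_t *lo, uint32_t *hi, uint32_t a, uint32_t b)
{
    uint64_t product = (uint64_t)a * b;
    *lo = (uint32_t)product;
    *hi = (uint32_t)(product >> 32);
}

/*
 * 32-bit MULX on a 64-bit target (assumed model): truncate both inputs,
 * multiply, then zero-extend the low half into the VEX.vvvv destination
 * and the high half into the MODRM destination.
 */
static void mulx32_on_64bit_target(uint64_t *vvvv, uint64_t *modrm,
                                   uint64_t t0, uint64_t t1)
{
    uint32_t lo, hi;

    assert(vvvv && modrm);
    mulu2_32(&lo, &hi, (uint32_t)t0, (uint32_t)t1);  /* trunc + mulu2_i32 */
    *vvvv = lo;                                      /* extu: low part */
    *modrm = hi;                                     /* extu: high part */
}
```

With t0 = 0xffffffffffffffff and t1 = 0x100000002, only the low 32 bits of each input take part, so the product is 0xffffffff * 2 = 0x1fffffffe: the low destination receives 0xfffffffe and the high destination receives 1. On a 32-bit target no truncation or extension is needed, which is why the patch lets that case fall through to the single _tl multiply.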