[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 33/52] target-m68k: inline divu/divs
From: |
Richard Henderson |
Subject: |
Re: [Qemu-devel] [PATCH 33/52] target-m68k: inline divu/divs |
Date: |
Fri, 6 May 2016 09:44:04 -1000 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.0 |
On 05/04/2016 11:08 AM, Laurent Vivier wrote:
-void HELPER(divu)(CPUM68KState *env, uint32_t word)
-{
- uint32_t num;
- uint32_t den;
- uint32_t quot;
- uint32_t rem;
-
- num = env->div1;
- den = env->div2;
- /* ??? This needs to make sure the throwing location is accurate. */
- if (den == 0) {
- raise_exception(env, EXCP_DIV0);
- }
Because of the exception, and required branching, I question taking the
division operations back inline. The throwing location is easily fixed; create
a version of raise_exception that has an argument for GETPC and uses
cpu_loop_exit_restore.
@@ -1179,64 +1187,182 @@ DISAS_INSN(mulw)
DISAS_INSN(divw)
{
+ TCGLabel *l1;
+ TCGv t0, src;
+ TCGv quot, rem;
int sign;
sign = (insn & 0x100) != 0;
+
+ tcg_gen_movi_i32(QREG_CC_C, 0); /* C is always cleared, use as 0 */
+
+ /* dest.l / src.w */
+
+ SRC_EA(env, t0, OS_WORD, sign, NULL);
+
+ src = tcg_temp_local_new_i32();
+ tcg_gen_mov_i32(src, t0);
+ l1 = gen_new_label();
+ tcg_gen_brcondi_i32(TCG_COND_NE, src, 0, l1);
+ tcg_gen_movi_i32(QREG_PC, s->insn_pc);
+ gen_raise_exception(EXCP_DIV0);
+ gen_set_label(l1);
+
+ tcg_gen_movi_i32(QREG_CC_C, 0); /* C is always cleared, use as 0 */
Redundant store to CC_C? Is that simply so that the 0 is forward propagated
within the second block by the tcg optimizer? If so, the comments could be
clearer.
+ quot = tcg_temp_new();
+ rem = tcg_temp_new();
if (sign) {
- gen_helper_divs(cpu_env, tcg_const_i32(1));
+ tcg_gen_div_i32(quot, DREG(insn, 9), src);
+ tcg_gen_rem_i32(rem, DREG(insn, 9), src);
+ tcg_gen_ext16s_i32(QREG_CC_V, quot);
+ tcg_gen_movi_i32(src, -1);
+ tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_V,
+ QREG_CC_V, quot,
+ QREG_CC_C /* 0 */, src /* -1 */);
setcond + neg is better than movcond here.
+ tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_V,
+ QREG_CC_V, QREG_CC_C /* 0 */,
+ QREG_CC_V /* 0 */, src /* -1 */);
Likewise.
+ /* result rem:quot */
+
+ tcg_gen_ext16u_i32(quot, quot);
+ tcg_gen_deposit_i32(quot, quot, rem, 16, 16);
The ext16u is redundant with the deposit.
+ tcg_temp_free(rem);
+
+ /* on overflow, operands and flags are unaffected */
+
+ tcg_gen_movcond_i32(TCG_COND_EQ, DREG(insn, 9),
+ QREG_CC_V, QREG_CC_C /* zero */,
+ quot, DREG(insn, 9));
+ tcg_gen_ext16s_i32(quot, quot);
+ tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_Z,
+ QREG_CC_V, QREG_CC_C /* zero */,
+ quot, QREG_CC_Z);
+ tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_N,
+ QREG_CC_V, QREG_CC_C /* zero */,
+ quot, QREG_CC_N);
+ tcg_temp_free(quot);
Interesting. The manual says "may", as in "may or may not". Was the existing
behaviour that didn't have early check for overflow a bug, or simply a bug for
exactly emulating specific cpu models?
Anyway, this multiplicity of movcond is another reason to consider leaving the
code out of line in a helper.
set_cc_op(s, CC_OP_FLAGS);
This needs to happen before the branch, along with an update_cc_op, surely.
Otherwise the store to CC_C doesn't necessarily have an effect on the computed
flags.
+ if (sign) {
+ tcg_gen_ext32s_i64(quot64, quot64);
+ tcg_gen_extrh_i64_i32(rem, quot64);
This is a long way around to simply extract the same sign bit out of quot, i.e.
tcg_gen_sari_i32(rem, quot, 31);
+ tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_V,
+ QREG_CC_V, rem,
+ QREG_CC_C /* 0 */, minusone /* -1 */);
Again with setcond + neg vs movcond.
r~
- [Qemu-devel] [PATCH 35/52] target-m68k: inline rotate ops, (continued)
- [Qemu-devel] [PATCH 35/52] target-m68k: inline rotate ops, Laurent Vivier, 2016/05/04
- [Qemu-devel] [PATCH 36/52] target-m68k: inline shift ops, Laurent Vivier, 2016/05/04
- [Qemu-devel] [PATCH 37/52] target-m68k: add cas/cas2 ops, Laurent Vivier, 2016/05/04
- [Qemu-devel] [PATCH 38/52] target-m68k: add linkl, Laurent Vivier, 2016/05/04
- [Qemu-devel] [PATCH 39/52] target-m68k: movem, Laurent Vivier, 2016/05/04
- Re: [Qemu-devel] [PATCH 33/52] target-m68k: inline divu/divs,
Richard Henderson <=
- [Qemu-devel] [PATCH 40/52] target-m68k: add exg ops, Laurent Vivier, 2016/05/04
- [Qemu-devel] [PATCH 41/52] target-m68k: add addressing modes to not, Laurent Vivier, 2016/05/04
- [Qemu-devel] [PATCH 44/52] target-m68k: and can manage word and byte operands, Laurent Vivier, 2016/05/04
- [Qemu-devel] [PATCH 42/52] target-m68k: eor can manage word and byte operands, Laurent Vivier, 2016/05/04
- [Qemu-devel] [PATCH 45/52] target-m68k: suba/adda can manage word operand, Laurent Vivier, 2016/05/04
- [Qemu-devel] [PATCH 46/52] target-m68k: introduce byte and word cc_ops, Laurent Vivier, 2016/05/04