Re: [Qemu-devel] [PATCH 33/52] target-m68k: inline divu/divs

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 33/52] target-m68k: inline divu/divs

From:	Richard Henderson
Subject:	Re: [Qemu-devel] [PATCH 33/52] target-m68k: inline divu/divs
Date:	Fri, 6 May 2016 09:44:04 -1000
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.0

On 05/04/2016 11:08 AM, Laurent Vivier wrote:

-void HELPER(divu)(CPUM68KState *env, uint32_t word)
-{
-    uint32_t num;
-    uint32_t den;
-    uint32_t quot;
-    uint32_t rem;
-
-    num = env->div1;
-    den = env->div2;
-    /* ??? This needs to make sure the throwing location is accurate.  */
-    if (den == 0) {
-        raise_exception(env, EXCP_DIV0);
-    }

Because of the exception, and required branching, I question taking thedivision operations back inline. The throwing location is easily fixed; createa version of raise_exception that has an argument for GETPC and usescpu_loop_exit_restore.

@@ -1179,64 +1187,182 @@ DISAS_INSN(mulw)

 DISAS_INSN(divw)
 {
+    TCGLabel *l1;
+    TCGv t0, src;
+    TCGv quot, rem;
     int sign;

     sign = (insn & 0x100) != 0;
+
+    tcg_gen_movi_i32(QREG_CC_C, 0); /* C is always cleared, use as 0 */
+
+    /* dest.l / src.w */
+
+    SRC_EA(env, t0, OS_WORD, sign, NULL);
+
+    src = tcg_temp_local_new_i32();
+    tcg_gen_mov_i32(src, t0);
+    l1 = gen_new_label();
+    tcg_gen_brcondi_i32(TCG_COND_NE, src, 0, l1);
+    tcg_gen_movi_i32(QREG_PC, s->insn_pc);
+    gen_raise_exception(EXCP_DIV0);
+    gen_set_label(l1);
+
+    tcg_gen_movi_i32(QREG_CC_C, 0); /* C is always cleared, use as 0 */

Redundant store to CC_C? Is that simply so that the 0 is forward propagatedwithin the second block by the tcg optimizer? If so, the comments could beclearer.

+    quot = tcg_temp_new();
+    rem = tcg_temp_new();
     if (sign) {
-        gen_helper_divs(cpu_env, tcg_const_i32(1));
+        tcg_gen_div_i32(quot, DREG(insn, 9), src);
+        tcg_gen_rem_i32(rem, DREG(insn, 9), src);
+        tcg_gen_ext16s_i32(QREG_CC_V, quot);
+        tcg_gen_movi_i32(src, -1);
+        tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_V,
+                            QREG_CC_V, quot,
+                            QREG_CC_C /* 0 */, src /* -1 */);


setcond + neg is better than movcond here.

+        tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_V,
+                            QREG_CC_V, QREG_CC_C /* 0 */,
+                            QREG_CC_V /* 0 */, src /* -1 */);


Likewise.

+    /* result rem:quot */
+
+    tcg_gen_ext16u_i32(quot, quot);
+    tcg_gen_deposit_i32(quot, quot, rem, 16, 16);


The ext16u is redundant with the deposit.

+    tcg_temp_free(rem);
+
+    /* on overflow, operands and flags are unaffected */
+
+    tcg_gen_movcond_i32(TCG_COND_EQ, DREG(insn, 9),
+                        QREG_CC_V, QREG_CC_C /* zero */,
+                        quot, DREG(insn, 9));
+    tcg_gen_ext16s_i32(quot, quot);
+    tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_Z,
+                        QREG_CC_V, QREG_CC_C /* zero */,
+                        quot, QREG_CC_Z);
+    tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_N,
+                        QREG_CC_V, QREG_CC_C /* zero */,
+                        quot, QREG_CC_N);
+    tcg_temp_free(quot);

Interesting. The manual says "may", as in "may or may not". Was the existingbehaviour that didn't have early check for overflow a bug, or simply a bug forexactly emulating specific cpu models?

Anyway, this multiplicity of movcond is another reason to consider leaving thecode out of line in a helper.

     set_cc_op(s, CC_OP_FLAGS);

This needs to happen before the branch, along with an update_cc_op, surely.Otherwise the store to CC_C doesn't necessarily have an effect on the computedflags.

+    if (sign) {
+        tcg_gen_ext32s_i64(quot64, quot64);
+        tcg_gen_extrh_i64_i32(rem, quot64);


This is a long way around to simply extract the same sign bit out of quot, i.e.

    tcg_gen_sari_i32(rem, quot, 31);

+        tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_V,
+                            QREG_CC_V, rem,
+                            QREG_CC_C /* 0 */, minusone /* -1 */);


Again with setcond + neg vs movcond.


r~

[Prev in Thread]

Current Thread

[Next in Thread]

[Qemu-devel] [PATCH 35/52] target-m68k: inline rotate ops, (continued)
- [Qemu-devel] [PATCH 40/52] target-m68k: add exg ops, Laurent Vivier, 2016/05/04
  - [Qemu-devel] [PATCH 41/52] target-m68k: add addressing modes to not, Laurent Vivier, 2016/05/04
    - Re: [Qemu-devel] [PATCH 41/52] target-m68k: add addressing modes to not, Richard Henderson, 2016/05/06
  - [Qemu-devel] [PATCH 44/52] target-m68k: and can manage word and byte operands, Laurent Vivier, 2016/05/04
    - Re: [Qemu-devel] [PATCH 44/52] target-m68k: and can manage word and byte operands, Richard Henderson, 2016/05/06
  - [Qemu-devel] [PATCH 42/52] target-m68k: eor can manage word and byte operands, Laurent Vivier, 2016/05/04
    - Re: [Qemu-devel] [PATCH 42/52] target-m68k: eor can manage word and byte operands, Richard Henderson, 2016/05/06
  - [Qemu-devel] [PATCH 45/52] target-m68k: suba/adda can manage word operand, Laurent Vivier, 2016/05/04
    - Re: [Qemu-devel] [PATCH 45/52] target-m68k: suba/adda can manage word operand, Richard Henderson, 2016/05/06
  - [Qemu-devel] [PATCH 46/52] target-m68k: introduce byte and word cc_ops, Laurent Vivier, 2016/05/04

Prev by Date: Re: [Qemu-devel] [PATCH 32/52] target-m68k: bitfield ops
Next by Date: Re: [Qemu-devel] [PATCH 34/52] target-m68k: add 64bit mull
Previous by thread: Re: [Qemu-devel] [PATCH 39/52] target-m68k: movem
Next by thread: [Qemu-devel] [PATCH 40/52] target-m68k: add exg ops
Index(es):
- Date
- Thread