qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 33/52] target-m68k: inline divu/divs


From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH 33/52] target-m68k: inline divu/divs
Date: Fri, 6 May 2016 09:44:04 -1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.0

On 05/04/2016 11:08 AM, Laurent Vivier wrote:
-void HELPER(divu)(CPUM68KState *env, uint32_t word)
-{
-    uint32_t num;
-    uint32_t den;
-    uint32_t quot;
-    uint32_t rem;
-
-    num = env->div1;
-    den = env->div2;
-    /* ??? This needs to make sure the throwing location is accurate.  */
-    if (den == 0) {
-        raise_exception(env, EXCP_DIV0);
-    }

Because of the exception, and required branching, I question taking the division operations back inline. The throwing location is easily fixed; create a version of raise_exception that has an argument for GETPC and uses cpu_loop_exit_restore.

@@ -1179,64 +1187,182 @@ DISAS_INSN(mulw)

 DISAS_INSN(divw)
 {
+    TCGLabel *l1;
+    TCGv t0, src;
+    TCGv quot, rem;
     int sign;

     sign = (insn & 0x100) != 0;
+
+    tcg_gen_movi_i32(QREG_CC_C, 0); /* C is always cleared, use as 0 */
+
+    /* dest.l / src.w */
+
+    SRC_EA(env, t0, OS_WORD, sign, NULL);
+
+    src = tcg_temp_local_new_i32();
+    tcg_gen_mov_i32(src, t0);
+    l1 = gen_new_label();
+    tcg_gen_brcondi_i32(TCG_COND_NE, src, 0, l1);
+    tcg_gen_movi_i32(QREG_PC, s->insn_pc);
+    gen_raise_exception(EXCP_DIV0);
+    gen_set_label(l1);
+
+    tcg_gen_movi_i32(QREG_CC_C, 0); /* C is always cleared, use as 0 */

Redundant store to CC_C? Is that simply so that the 0 is forward propagated within the second block by the tcg optimizer? If so, the comments could be clearer.

+    quot = tcg_temp_new();
+    rem = tcg_temp_new();
     if (sign) {
-        gen_helper_divs(cpu_env, tcg_const_i32(1));
+        tcg_gen_div_i32(quot, DREG(insn, 9), src);
+        tcg_gen_rem_i32(rem, DREG(insn, 9), src);
+        tcg_gen_ext16s_i32(QREG_CC_V, quot);
+        tcg_gen_movi_i32(src, -1);
+        tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_V,
+                            QREG_CC_V, quot,
+                            QREG_CC_C /* 0 */, src /* -1 */);

setcond + neg is better than movcond here.

+        tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_V,
+                            QREG_CC_V, QREG_CC_C /* 0 */,
+                            QREG_CC_V /* 0 */, src /* -1 */);

Likewise.

+    /* result rem:quot */
+
+    tcg_gen_ext16u_i32(quot, quot);
+    tcg_gen_deposit_i32(quot, quot, rem, 16, 16);

The ext16u is redundant with the deposit.

+    tcg_temp_free(rem);
+
+    /* on overflow, operands and flags are unaffected */
+
+    tcg_gen_movcond_i32(TCG_COND_EQ, DREG(insn, 9),
+                        QREG_CC_V, QREG_CC_C /* zero */,
+                        quot, DREG(insn, 9));
+    tcg_gen_ext16s_i32(quot, quot);
+    tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_Z,
+                        QREG_CC_V, QREG_CC_C /* zero */,
+                        quot, QREG_CC_Z);
+    tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_N,
+                        QREG_CC_V, QREG_CC_C /* zero */,
+                        quot, QREG_CC_N);
+    tcg_temp_free(quot);

Interesting. The manual says "may", as in "may or may not". Was the existing behaviour that didn't have early check for overflow a bug, or simply a bug for exactly emulating specific cpu models?

Anyway, this multiplicity of movcond is another reason to consider leaving the code out of line in a helper.

     set_cc_op(s, CC_OP_FLAGS);

This needs to happen before the branch, along with an update_cc_op, surely. Otherwise the store to CC_C doesn't necessarily have an effect on the computed flags.

+    if (sign) {
+        tcg_gen_ext32s_i64(quot64, quot64);
+        tcg_gen_extrh_i64_i32(rem, quot64);

This is a long way around to simply extract the same sign bit out of quot, i.e.

    tcg_gen_sari_i32(rem, quot, 31);

+        tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_V,
+                            QREG_CC_V, rem,
+                            QREG_CC_C /* 0 */, minusone /* -1 */);

Again with setcond + neg vs movcond.


r~



reply via email to

[Prev in Thread] Current Thread [Next in Thread]