From: Paolo Bonzini
Subject: [PULL 11/34] target/i386: emulate: microoptimize and explain ADD_COUT_VEC/SUB_COUT_VEC
Date: Wed, 23 Apr 2025 11:40:41 +0200
The logic is unchanged, but majority(NOT a, b, c) is factored out into a
separate macro, MAJ_INV1, and implemented without NOT operations. The
GET_ADD_OVERFLOW macro is removed as well.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
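Not part of the commit: a minimal sketch for checking that the rewritten
macros agree with the ones they replace. The macro bodies are copied from
this patch; the OLD_ prefixes, majority() and count_mismatches() are
hypothetical names introduced only for this check.

```c
#include <stdint.h>

/* Old definitions, copied from the lines this patch removes. */
#define OLD_ADD_COUT_VEC(op1, op2, result) \
    (((op1) & (op2)) | (((op1) | (op2)) & (~(result))))
#define OLD_SUB_COUT_VEC(op1, op2, result) \
    (((~(op1)) & (op2)) | (((~(op1)) ^ (op2)) & (result)))

/* New definitions, copied from the lines this patch adds. */
#define MAJ_INV1(a, b, c)  ((((a) ^ (b)) & ((b) ^ (c))) ^ (c))
#define ADD_COUT_VEC(op1, op2, result) MAJ_INV1(result, op1, op2)
#define SUB_COUT_VEC(op1, op2, result) MAJ_INV1(op1, op2, result)

/* Textbook three-input majority, used as the reference for MAJ_INV1. */
static uint32_t majority(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (a & c) | (b & c);
}

/*
 * Hypothetical helper: counts the single-bit input combinations for which
 * a new macro disagrees with its reference. Every macro here is purely
 * bitwise, so the eight combinations of one bit cover all bit positions.
 */
static int count_mismatches(void)
{
    int bad = 0;

    for (uint32_t i = 0; i < 8; i++) {
        uint32_t a = i & 1, b = (i >> 1) & 1, c = (i >> 2) & 1;

        /* MAJ_INV1(a, b, c) should equal majority(NOT a, b, c). */
        bad += (MAJ_INV1(a, b, c) & 1) != majority(~a & 1, b, c);
        /* New and old carry macros should agree on bit 0. */
        bad += (ADD_COUT_VEC(a, b, c) & 1) != (OLD_ADD_COUT_VEC(a, b, c) & 1);
        bad += (SUB_COUT_VEC(a, b, c) & 1) != (OLD_SUB_COUT_VEC(a, b, c) & 1);
    }
    return bad;
}
```

Calling count_mismatches() from a small test program should return 0 if
the three identities hold.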
target/i386/hvf/x86_flags.c | 26 +++++++++++++++++++++-----
1 file changed, 21 insertions(+), 5 deletions(-)
diff --git a/target/i386/hvf/x86_flags.c b/target/i386/hvf/x86_flags.c
index fedc70a1b80..60ab4f01a20 100644
--- a/target/i386/hvf/x86_flags.c
+++ b/target/i386/hvf/x86_flags.c
@@ -45,14 +45,30 @@
#define LF_MASK_CF (0x01 << LF_BIT_CF)
#define LF_MASK_PO (0x01 << LF_BIT_PO)
+/* majority(NOT a, b, c) = (a ^ b) ? b : c */
+#define MAJ_INV1(a, b, c) ((((a) ^ (b)) & ((b) ^ (c))) ^ (c))
+
+/*
+ * ADD_COUT_VEC(x, y) = majority((x + y) ^ x ^ y, x, y)
+ *
+ * If two corresponding bits in x and y are the same, that bit is the carry
+ * out regardless of (x+y)^x^y, which thus matters only where x^y is 1. Hence
+ * x^y can be replaced with 1 in (x+y)^x^y, giving majority(NOT (x+y), x, y)
+ */
#define ADD_COUT_VEC(op1, op2, result) \
- (((op1) & (op2)) | (((op1) | (op2)) & (~(result))))
+ MAJ_INV1(result, op1, op2)
+/*
+ * SUB_COUT_VEC(x, y) = NOT majority(x, NOT y, (x - y) ^ x ^ NOT y)
+ * = majority(NOT x, y, (x - y) ^ x ^ y)
+ *
+ * Note that the carry out is actually a borrow, i.e. it is inverted.
+ * If two corresponding bits in x and y are different, the value of the
+ * bit in (x-y)^x^y likewise does not matter. Hence, x^y can be replaced
+ * with 0 in (x-y)^x^y, resulting in majority(NOT x, y, x-y)
+ */
#define SUB_COUT_VEC(op1, op2, result) \
- (((~(op1)) & (op2)) | (((~(op1)) ^ (op2)) & (result)))
-
-#define GET_ADD_OVERFLOW(op1, op2, result, mask) \
- ((((op1) ^ (result)) & ((op2) ^ (result))) & (mask))
+ MAJ_INV1(op1, op2, result)
/* ******************* */
/* OSZAPC */
--
2.49.0
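
As a further sanity check on the comments in the patch, the sketch below
compares the two macros against a bit-serial reference for every pair of
8-bit operands. The 8-bit width is an assumption chosen to keep the search
exhaustive; ref_add_cout(), ref_sub_bout() and count_cout_mismatches() are
illustrative names that do not exist in QEMU.

```c
#include <stdint.h>

/* Copied from the lines this patch adds. */
#define MAJ_INV1(a, b, c)  ((((a) ^ (b)) & ((b) ^ (c))) ^ (c))
#define ADD_COUT_VEC(op1, op2, result) MAJ_INV1(result, op1, op2)
#define SUB_COUT_VEC(op1, op2, result) MAJ_INV1(op1, op2, result)

/* Reference: carry out of every bit position of x + y, one bit at a time. */
static uint8_t ref_add_cout(uint8_t x, uint8_t y)
{
    uint8_t vec = 0;
    unsigned carry = 0;

    for (int i = 0; i < 8; i++) {
        unsigned xb = (x >> i) & 1, yb = (y >> i) & 1;

        carry = (xb + yb + carry) >> 1;     /* carry out of bit i */
        vec |= (uint8_t)(carry << i);
    }
    return vec;
}

/* Reference: borrow out of every bit position of x - y. */
static uint8_t ref_sub_bout(uint8_t x, uint8_t y)
{
    uint8_t vec = 0;
    unsigned borrow = 0;

    for (int i = 0; i < 8; i++) {
        unsigned xb = (x >> i) & 1, yb = (y >> i) & 1;

        borrow = xb < yb + borrow;          /* borrow out of bit i */
        vec |= (uint8_t)(borrow << i);
    }
    return vec;
}

/* Counts operand pairs where a macro disagrees with its reference. */
static unsigned count_cout_mismatches(void)
{
    unsigned bad = 0;

    for (unsigned x = 0; x < 256; x++) {
        for (unsigned y = 0; y < 256; y++) {
            uint8_t sum = (uint8_t)(x + y);
            uint8_t diff = (uint8_t)(x - y);

            bad += (uint8_t)ADD_COUT_VEC(x, y, sum) != ref_add_cout(x, y);
            bad += (uint8_t)SUB_COUT_VEC(x, y, diff) != ref_sub_bout(x, y);
        }
    }
    return bad;
}
```

For instance, 0xFF + 0x01 carries out of every bit position and 0x00 - 0x01
borrows out of every bit position, so both vectors are 0xFF in the low byte.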