From: Stefan Brankovic
Subject: [Qemu-devel] [PATCH 2/8] target/ppc: Optimize emulation of vsl and vsr instructions
Date: Thu, 6 Jun 2019 12:15:24 +0200
Optimize the Altivec instructions vsl and vsr (Vector Shift Left/Right).
These instructions shift the 128-bit value of register vA left or right,
respectively, by the shift amount given in bits 125-127 of register vB.
The lowest 3 bits in each byte element of register vB must be identical,
otherwise the result is undefined.
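
Purely as a reference for these semantics (not part of the patch; it assumes
a host compiler that provides unsigned __int128, and the helper names are made
up for the sketch):

#include <stdint.h>

/* The shift amount is the low 3 bits of the last byte of vB (bits 125-127). */
static unsigned __int128 vsl_ref(unsigned __int128 va, uint8_t vb_last_byte)
{
    return va << (vb_last_byte & 0x7);
}

static unsigned __int128 vsr_ref(unsigned __int128 va, uint8_t vb_last_byte)
{
    return va >> (vb_last_byte & 0x7);
}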
For the vsl instruction, we first save bits 125-127 of register vB in
variable sh. We then save the highest sh bits of the lower doubleword
element of register vA in variable shifted, so that those bits are not
lost when we shift the lower doubleword element of register vA, which is
our next step. After shifting the lower doubleword element, we shift the
higher doubleword element of vA and replace its lowest sh bits (which are
now 0) with the bits saved in shifted.
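
For illustration only, the same decomposition in plain C on two 64-bit halves
looks roughly like this (hi, lo and sh are names made up for the sketch, not
identifiers from the patch; the sketch guards sh == 0 because shifting a
64-bit value by 64 bits is undefined in C):

/* Sketch: shift the 128-bit value hi:lo left by sh (0-7). */
static void vsl_sketch(uint64_t *hi, uint64_t *lo, unsigned sh)
{
    /* Highest sh bits of the lower doubleword, to be carried into hi. */
    uint64_t carried = sh ? (*lo >> (64 - sh)) : 0;

    *lo <<= sh;                    /* shift lower doubleword */
    *hi = (*hi << sh) | carried;   /* shift higher doubleword, re-insert carried bits */
}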
For the vsr instruction, we first save bits 125-127 of register vB in
variable sh. We then save the lowest sh bits of the higher doubleword
element of register vA in variable shifted, so that those bits are not
lost when we shift the higher doubleword element of register vA, which is
our next step. After shifting the higher doubleword element, we shift the
lower doubleword element of vA and replace its highest sh bits (which are
now 0) with the bits saved in shifted.
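
And the mirror-image sketch for the right shift, under the same assumptions:

/* Sketch: shift the 128-bit value hi:lo right by sh (0-7). */
static void vsr_sketch(uint64_t *hi, uint64_t *lo, unsigned sh)
{
    /* Lowest sh bits of the higher doubleword, to be carried into lo. */
    uint64_t carried = sh ? (*hi << (64 - sh)) : 0;

    *hi >>= sh;                    /* shift higher doubleword */
    *lo = (*lo >> sh) | carried;   /* shift lower doubleword, re-insert carried bits */
}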
Signed-off-by: Stefan Brankovic <address@hidden>
---
target/ppc/translate/vmx-impl.inc.c | 101 +++++++++++++++++++++++++++++++++++-
1 file changed, 99 insertions(+), 2 deletions(-)
diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index 140bb05..6bd072a 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -592,6 +592,103 @@ static void trans_lvsr(DisasContext *ctx)
tcg_temp_free(EA);
}
+/*
+ * vsl VRT,VRA,VRB - Vector Shift Left
+ *
+ * Shifting left 128 bit value of vA by value specified in bits 125-127 of vB.
+ * Lowest 3 bits in each byte element of register vB must be identical or
+ * result is undefined.
+ */
+static void trans_vsl(DisasContext *ctx)
+{
+    int VT = rD(ctx->opcode);
+    int VA = rA(ctx->opcode);
+    int VB = rB(ctx->opcode);
+    TCGv_i64 avrA = tcg_temp_new_i64();
+    TCGv_i64 avrB = tcg_temp_new_i64();
+    TCGv_i64 sh = tcg_temp_new_i64();
+    TCGv_i64 shifted = tcg_temp_new_i64();
+    TCGv_i64 tmp = tcg_temp_new_i64();
+
+    /* Place bits 125-127 of vB in sh. */
+    get_avr64(avrB, VB, false);
+    tcg_gen_andi_i64(sh, avrB, 0x07ULL);
+
+    /*
+     * Save highest sh bits of lower doubleword element of vA in variable
+     * shifted and perform shift on lower doubleword.
+     */
+    get_avr64(avrA, VA, false);
+    tcg_gen_subi_i64(tmp, sh, 64);
+    tcg_gen_neg_i64(tmp, tmp);
+    tcg_gen_shr_i64(shifted, avrA, tmp);
+    tcg_gen_shl_i64(avrA, avrA, sh);
+    set_avr64(VT, avrA, false);
+
+    /*
+     * Perform shift on higher doubleword element of vA and replace lowest
+     * sh bits with shifted.
+     */
+    get_avr64(avrA, VA, true);
+    tcg_gen_shl_i64(avrA, avrA, sh);
+    tcg_gen_or_i64(avrA, avrA, shifted);
+    set_avr64(VT, avrA, true);
+
+    tcg_temp_free_i64(avrA);
+    tcg_temp_free_i64(avrB);
+    tcg_temp_free_i64(sh);
+    tcg_temp_free_i64(shifted);
+    tcg_temp_free_i64(tmp);
+}
+
+/*
+ * vsr VRT,VRA,VRB - Vector Shift Right
+ *
+ * Shifting right 128 bit value of vA by value specified in bits 125-127 of vB.
+ * Lowest 3 bits in each byte element of register vB must be identical or
+ * result is undefined.
+ */
+static void trans_vsr(DisasContext *ctx)
+{
+    int VT = rD(ctx->opcode);
+    int VA = rA(ctx->opcode);
+    int VB = rB(ctx->opcode);
+    TCGv_i64 avrA = tcg_temp_new_i64();
+    TCGv_i64 avrB = tcg_temp_new_i64();
+    TCGv_i64 sh = tcg_temp_new_i64();
+    TCGv_i64 shifted = tcg_temp_new_i64();
+    TCGv_i64 tmp = tcg_temp_new_i64();
+
+    /* Place bits 125-127 of vB in sh. */
+    get_avr64(avrB, VB, false);
+    tcg_gen_andi_i64(sh, avrB, 0x07ULL);
+
+    /*
+     * Save lowest sh bits of higher doubleword element of vA in variable
+     * shifted and perform shift on higher doubleword.
+     */
+    get_avr64(avrA, VA, true);
+    tcg_gen_subi_i64(tmp, sh, 64);
+    tcg_gen_neg_i64(tmp, tmp);
+    tcg_gen_shl_i64(shifted, avrA, tmp);
+    tcg_gen_shr_i64(avrA, avrA, sh);
+    set_avr64(VT, avrA, true);
+    /*
+     * Perform shift on lower doubleword element of vA and replace highest
+     * sh bits with shifted.
+     */
+    get_avr64(avrA, VA, false);
+    tcg_gen_shr_i64(avrA, avrA, sh);
+    tcg_gen_or_i64(avrA, avrA, shifted);
+    set_avr64(VT, avrA, false);
+
+    tcg_temp_free_i64(avrA);
+    tcg_temp_free_i64(avrB);
+    tcg_temp_free_i64(sh);
+    tcg_temp_free_i64(shifted);
+    tcg_temp_free_i64(tmp);
+}
+
GEN_VXFORM(vmuloub, 4, 0);
GEN_VXFORM(vmulouh, 4, 1);
GEN_VXFORM(vmulouw, 4, 2);
@@ -699,11 +796,11 @@ GEN_VXFORM(vrld, 2, 3);
GEN_VXFORM(vrldmi, 2, 3);
GEN_VXFORM_DUAL(vrld, PPC_NONE, PPC2_ALTIVEC_207, \
                vrldmi, PPC_NONE, PPC2_ISA300)
-GEN_VXFORM(vsl, 2, 7);
+GEN_VXFORM_TRANS(vsl, 2, 7);
GEN_VXFORM(vrldnm, 2, 7);
GEN_VXFORM_DUAL(vsl, PPC_ALTIVEC, PPC_NONE, \
                vrldnm, PPC_NONE, PPC2_ISA300)
-GEN_VXFORM(vsr, 2, 11);
+GEN_VXFORM_TRANS(vsr, 2, 11);
GEN_VXFORM_ENV(vpkuhum, 7, 0);
GEN_VXFORM_ENV(vpkuwum, 7, 1);
GEN_VXFORM_ENV(vpkudum, 7, 17);
--
2.7.4