From: Mark Cave-Ayland
Subject: [Qemu-ppc] [PATCH v3 0/9] target/ppc: prepare for conversion to TCG vector operations
Date: Thu, 20 Dec 2018 16:31:14 +0000
This patchset attempts to improve VMX (Altivec) instruction performance by
laying the groundwork for use of the new TCG vector operations.
Patches 1 and 2 fix a sign-extension error in EXTRACT_SHELPER and an
associated typo in the SIMM5 macro, both discovered whilst testing Richard's
follow-on TCG vector improvements patchset.
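As an illustration of why signedness matters when extracting an immediate
field, here is a minimal standalone sketch. It is not the code from patches 1
and 2: field_extract32() and field_sextract32() are local stand-ins written for
this example, and the placement of the 5-bit immediate at bit 16 is an
assumption made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in (hypothetical): unsigned bit-field extract.
 * Returns `length` bits of `value` starting at bit `start`,
 * where bit 0 is the least significant bit. */
static inline uint32_t field_extract32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~0U >> (32 - length));
}

/* Local stand-in (hypothetical): signed bit-field extract.
 * Same field as above, but the top bit of the field is treated
 * as a sign bit and replicated into the upper bits of the result. */
static inline int32_t field_sextract32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    uint32_t field = field_extract32(value, start, length);
    uint32_t sign = 1U << (length - 1);
    return (int32_t)((field ^ sign) - sign);
}

/* Assumed field layout for this sketch: 5-bit immediate at bit 16. */
#define SKETCH_SIMM5_START 16
#define SKETCH_SIMM5_LEN   5

/* Decode the field as a signed immediate (range -16..15). */
static inline int32_t sketch_simm5(uint32_t opcode)
{
    return field_sextract32(opcode, SKETCH_SIMM5_START, SKETCH_SIMM5_LEN);
}

/* Decode the same bits without sign extension (range 0..31). */
static inline uint32_t sketch_uimm5(uint32_t opcode)
{
    return field_extract32(opcode, SKETCH_SIMM5_START, SKETCH_SIMM5_LEN);
}
```

With all five field bits set, the signed decode yields -1 whereas the unsigned
decode of the same bits yields 31, so using the wrong variant silently changes
the operand for negative immediates only.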
In order to use TCG vector operations, the registers must be accessible from
cpu_env, whilst currently they are accessed via arrays of static TCG globals.
Patches 3-5 are therefore mechanical patches which introduce access helpers
for FPR, AVR and VSR registers using the supplied TCGv_i64 parameter.
Once this is done, patch 6 enables us to remove the static TCG global arrays
and updates the access helpers to read/write to the relevant fields in cpu_env
directly.
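The idea of a get/set helper that reaches a register through its offset from
the CPU state pointer can be modelled in plain C. The sketch below is only an
analogy: it does not use the TCG API, and ModelCPUState, model_fpr_offset(),
model_get_fpr() and model_set_fpr() are hypothetical names invented for this
example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily reduced CPU state for this sketch. */
typedef struct ModelCPUState {
    uint64_t gpr[32];
    uint64_t fpr[32];
} ModelCPUState;

/* Byte offset of FP register `regno` from the start of the state. */
static inline size_t model_fpr_offset(int regno)
{
    return offsetof(ModelCPUState, fpr) + (size_t)regno * sizeof(uint64_t);
}

/* Read an FP register via "state pointer + offset" instead of via a
 * dedicated per-register handle. */
static inline uint64_t model_get_fpr(const ModelCPUState *env, int regno)
{
    uint64_t val;
    memcpy(&val, (const unsigned char *)env + model_fpr_offset(regno),
           sizeof(val));
    return val;
}

/* Write an FP register the same way. */
static inline void model_set_fpr(ModelCPUState *env, int regno, uint64_t val)
{
    memcpy((unsigned char *)env + model_fpr_offset(regno), &val, sizeof(val));
}
```

Because callers only ever go through the helper pair, the storage behind them
can change (as patch 6 does for the real helpers) without touching each call
site again.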
Patches 7 and 8 perform the legwork required to enable VSX instructions to be
converted to use TCG vector operations in future by rearranging the FP, VMX
and VSX registers into a single aligned VSR register array (the scope of this
patchset is VMX only).
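A rough model of such a merged layout is sketched below. It relies on the
architectural overlap in which FPR n occupies the most significant doubleword
of VSR n and VMX register n is VSR 32+n. Everything else is an assumption for
illustration: the type and function names are hypothetical, the host byte-order
check uses the GCC/Clang predefined __BYTE_ORDER__ macro, and the layout is not
claimed to match the one introduced by patches 7 and 8.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit register type with several element views. */
typedef union ModelVsr {
    uint8_t  u8[16];
    uint32_t u32[4];
    uint64_t u64[2];
} ModelVsr;

/* Hypothetical reduced CPU state: one 16-byte aligned array holds all
 * 64 vector-scalar registers. */
typedef struct ModelVecState {
    uint64_t other_state;            /* placeholder for unrelated fields */
    _Alignas(16) ModelVsr vsr[64];
} ModelVecState;

/* Map architectural doubleword index (0 = most significant) to the host
 * array index. Assumes a little-endian host unless told otherwise. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define MODEL_DW(i) (i)
#else
#define MODEL_DW(i) (1 - (i))
#endif

/* Offset of the full 128-bit VSR n. */
static inline size_t vec_vsr_offset(int n)
{
    return offsetof(ModelVecState, vsr) + (size_t)n * sizeof(ModelVsr);
}

/* FPR n lives in doubleword 0 of VSR n. */
static inline size_t vec_fpr_offset(int n)
{
    return vec_vsr_offset(n) + MODEL_DW(0) * sizeof(uint64_t);
}

/* VMX register n is VSR 32 + n. */
static inline size_t vec_avr_offset(int n)
{
    return vec_vsr_offset(32 + n);
}
```

With one contiguous, aligned array, every register is a full 128-bit slot at a
predictable offset, which is the property whole-vector operations need.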
Patch 9 removes the AVR* macros and replaces them with the corresponding Vsr*
macros since they are equivalent.
Finally, thanks to Richard for taking the time to answer some of my (mostly
beginner) questions related to TCG.
Signed-off-by: Mark Cave-Ayland <address@hidden>
v3:
- Rebase onto master, drop RFC prefix, alter subject line
- Add A-B tags from David
- Add SIMM5/EXTRACT_HELPER macro fix patches to the start of the series
- Drop patch 4 from previous patchset (delay AVR register writeback) as it
  should not be required.
- Remove extra get_fpr() accidentally added to GEN_FLOAT macros in patch 3
- Fix temporary leak when VMX/VSX not enabled in patches 4 and 5
- Add patch to remove AVR* macros, replacing them with Vsr* macros
- Drop patches converting logical, add and sub instructions to TCG vector ops
  (let Richard incorporate this into his TCG vector improvements patchset)
v2:
- Rebase onto master
- Add comment explaining rationale for FPR helpers in description for patch 1
- Add R-B tags from Richard
- Add patch 3 to delay AVR register writeback as spotted by Richard
- Add patches 6 and 7 to merge FPR, VMX and VSX registers into the vsr array
  to facilitate conversion of VSX instructions to vector operations later
- Fix accidental bug whereby the conversion of get_vsr()/set_vsr() to access
  data from cpu_env was incorrectly squashed into patch 3
- Move set_fpr() further down in gen_fsqrts() and gen_frsqrtes() in patch 1
Mark Cave-Ayland (9):
  target/ppc: fix typo in SIMM5 extraction helper
  target/ppc: switch EXTRACT_HELPER macros over to use
    sextract32/extract32
  target/ppc: introduce get_fpr() and set_fpr() helpers for FP register
    access
  target/ppc: introduce get_avr64() and set_avr64() helpers for VMX
    register access
  target/ppc: introduce get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}()
    helpers for VSR register access
  target/ppc: switch FPR, VMX and VSX helpers to access data directly
    from cpu_env
  target/ppc: merge ppc_vsr_t and ppc_avr_t union types
  target/ppc: move FP and VMX registers into aligned vsr register array
  target/ppc: replace AVR* macros with Vsr* macros
linux-user/ppc/signal.c | 24 +-
target/ppc/arch_dump.c | 12 +-
target/ppc/cpu.h | 26 +-
target/ppc/gdbstub.c | 8 +-
target/ppc/int_helper.c | 94 ++--
target/ppc/internal.h | 43 +-
target/ppc/machine.c | 72 ++-
target/ppc/monitor.c | 4 +-
target/ppc/translate.c | 73 ++-
target/ppc/translate/dfp-impl.inc.c | 2 +-
target/ppc/translate/fp-impl.inc.c | 486 +++++++++++++++-----
target/ppc/translate/vmx-impl.inc.c | 154 +++++--
target/ppc/translate/vsx-impl.inc.c | 862 ++++++++++++++++++++++++++----------
target/ppc/translate_init.inc.c | 24 +-
14 files changed, 1339 insertions(+), 545 deletions(-)
--
2.11.0