[Qemu-ppc] [PATCH v3 0/9] target/ppc: prepare for conversion to TCG vector operations

From: Mark Cave-Ayland
Subject: [Qemu-ppc] [PATCH v3 0/9] target/ppc: prepare for conversion to TCG vector operations
Date: Thu, 20 Dec 2018 16:31:14 +0000

This patchset attempts to improve VMX (Altivec) instruction performance by
laying the groundwork for use of the new TCG vector operations.

Patches 1 and 2 fix a sign-extension error in EXTRACT_SHELPER and an
associated typo in the SIMM5 macro, both of which were discovered whilst
testing the follow-on TCG vector improvements patchset.
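
As a rough illustration of the kind of sign-extension issue involved, the
sketch below extracts a 5-bit immediate both unsigned and signed. It is a
standalone example, not the code from the patches: the helper names
field_extract() and field_extract_signed() are made up, and the field position
(5 bits starting at bit 16) is an assumption chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: unsigned field of `len` bits starting at bit `start`. */
static inline uint32_t field_extract(uint32_t opcode, int start, int len)
{
    assert(start >= 0 && len > 0 && len <= 32 - start);
    return (opcode >> start) & (~0U >> (32 - len));
}

/* Hypothetical helper: the same field, sign-extended from its top bit. */
static inline int32_t field_extract_signed(uint32_t opcode, int start, int len)
{
    uint32_t v = field_extract(opcode, start, len);
    uint32_t sign = 1U << (len - 1);

    /*
     * Flipping the sign bit and subtracting it back maps the upper half of
     * the field's range onto negative values. The arithmetic is done in
     * 64 bits so the final narrowing conversion is always in range.
     */
    return (int32_t)((int64_t)(v ^ sign) - (int64_t)sign);
}
```

With all five field bits set, the unsigned helper yields 31 while the signed
one yields -1, which is the difference a missing sign extension would hide.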

In order to use TCG vector operations, the registers must be accessible from
cpu_env, whilst currently they are accessed via arrays of static TCG globals.
Patches 3-5 are therefore mechanical patches which introduce access helpers
for FPR, AVR and VSR registers using the supplied TCGv_i64 parameter.

Once this is done, patch 6 enables us to remove the static TCG global arrays
and updates the access helpers to read/write to the relevant fields in cpu_env
directly.
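
The shape of such helpers can be sketched in plain C, outside of TCG. This is
only an analogy under stated assumptions: MockCPUState, mock_get_fpr() and
mock_set_fpr() are hypothetical names, and memcpy() at a computed offset stands
in for a load/store relative to cpu_env. The real helpers operate on TCGv_i64
values at translation time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the CPU state structure. */
typedef struct MockCPUState {
    uint64_t gpr[32];
    uint64_t fpr[32];
} MockCPUState;

/* Byte offset of FP register `regno` within the state structure. */
static size_t mock_fpr_offset(int regno)
{
    assert(regno >= 0 && regno < 32);
    return offsetof(MockCPUState, fpr) + (size_t)regno * sizeof(uint64_t);
}

/* Read FP register `regno` into the caller-supplied destination. */
static void mock_get_fpr(uint64_t *dst, const MockCPUState *env, int regno)
{
    memcpy(dst, (const char *)env + mock_fpr_offset(regno), sizeof(*dst));
}

/* Write the caller-supplied value into FP register `regno`. */
static void mock_set_fpr(MockCPUState *env, int regno, const uint64_t *src)
{
    memcpy((char *)env + mock_fpr_offset(regno), src, sizeof(*src));
}
```

Because callers only ever go through the get/set pair, the storage behind the
helpers can be changed later without touching the call sites.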

Patches 7 and 8 perform the legwork required to enable VSX instructions to be
converted to use TCG vector operations in future by rearranging the FP, VMX
and VSX registers into a single aligned VSR register array (the scope of this
patchset is VMX only).
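
A minimal sketch of what a single aligned register array could look like is
given below. It relies on the architectural layout in which FPR n overlays
doubleword 0 of VSR n and VMX register n is VSR 32 + n. Everything else is an
assumption for illustration: the Mock* names are hypothetical, and treating
u64[0] as the doubleword backing the FPR ignores host-endianness handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit register: one slot serves the FP, VMX and VSX views. */
typedef union MockVsr {
    uint64_t u64[2];
    uint8_t u8[16];
} MockVsr;

typedef struct MockVecState {
    uint64_t other_state;           /* stand-in for unrelated CPU fields */
    _Alignas(16) MockVsr vsr[64];   /* 16-byte aligned for vector access */
} MockVecState;

/* Assumption: u64[0] is the doubleword that backs FP register `n`. */
static uint64_t *mock_fpr(MockVecState *s, int n)
{
    assert(n >= 0 && n < 32);
    return &s->vsr[n].u64[0];
}

/* VMX register `n` lives in the upper half of the array. */
static MockVsr *mock_avr(MockVecState *s, int n)
{
    assert(n >= 0 && n < 32);
    return &s->vsr[32 + n];
}
```

Keeping every register in one aligned array means a single base offset plus an
index is enough to address any FP, VMX or VSX register.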

Patch 9 removes the AVR* macros and replaces them with the corresponding Vsr*
macros since they are equivalent.

Finally, thanks to Richard for taking the time to answer some of my questions
related to TCG.

Signed-off-by: Mark Cave-Ayland <address@hidden>

v3:
- Rebase onto master, drop RFC prefix, alter subject line
- Add A-B tags from David
- Add SIMM5/EXTRACT_HELPER macro fix patches to the start of the series
- Drop patch 4 from previous patchset (delay AVR register writeback) as it
  should not be required.
- Remove extra get_fpr() accidentally added to GEN_FLOAT macros in patch 3
- Fix temporary leak when VMX/VSX not enabled in patches 4 and 5
- Add patch to remove AVR* macros, replacing them with Vsr* macros
- Drop patches converting logical, add and sub instructions to TCG vector ops
  (let Richard incorporate this into his TCG vector improvements patchset)

v2:
- Rebase onto master
- Add comment explaining rationale for FPR helpers in description for patch 1
- Add R-B tags from Richard
- Add patch 3 to delay AVR register writeback as spotted by Richard
- Add patches 6 and 7 to merge FPR, VMX and VSX registers into the vsr array
  to facilitate conversion of VSX instructions to vector operations later
- Fix accidental bug whereby the conversion of get_vsr()/set_vsr() to access
  data from cpu_env was incorrectly squashed into patch 3
- Move set_fpr() further down in gen_fsqrts() and gen_frsqrtes() in patch 1

Mark Cave-Ayland (9):
  target/ppc: fix typo in SIMM5 extraction helper
  target/ppc: switch EXTRACT_HELPER macros over to use
  target/ppc: introduce get_fpr() and set_fpr() helpers for FP register
    access
  target/ppc: introduce get_avr64() and set_avr64() helpers for VMX
    register access
  target/ppc: introduce get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}()
    helpers for VSR register access
  target/ppc: switch FPR, VMX and VSX helpers to access data directly
    from cpu_env
  target/ppc: merge ppc_vsr_t and ppc_avr_t union types
  target/ppc: move FP and VMX registers into aligned vsr register array
  target/ppc: replace AVR* macros with Vsr* macros

 linux-user/ppc/signal.c             |  24 +-
 target/ppc/arch_dump.c              |  12 +-
 target/ppc/cpu.h                    |  26 +-
 target/ppc/gdbstub.c                |   8 +-
 target/ppc/int_helper.c             |  94 ++--
 target/ppc/internal.h               |  43 +-
 target/ppc/machine.c                |  72 ++-
 target/ppc/monitor.c                |   4 +-
 target/ppc/translate.c              |  73 ++-
 target/ppc/translate/dfp-impl.inc.c |   2 +-
 target/ppc/translate/fp-impl.inc.c  | 486 +++++++++++++++-----
 target/ppc/translate/vmx-impl.inc.c | 154 +++++--
 target/ppc/translate/vsx-impl.inc.c | 862 ++++++++++++++++++++++++++----------
 target/ppc/translate_init.inc.c     |  24 +-
 14 files changed, 1339 insertions(+), 545 deletions(-)


