qemu-devel
Re: [Qemu-devel] [PATCH 5/x] ppc: Convert op_load_gpr_{T0, T1, T2} to TCG


From: Andreas Färber
Subject: Re: [Qemu-devel] [PATCH 5/x] ppc: Convert op_load_gpr_{T0, T1, T2} to TCG
Date: Wed, 3 Sep 2008 18:04:00 +0200


On 03.09.2008 at 14:58, Andreas Färber wrote:

On 03.09.2008 at 14:41, Aurélien Jarno wrote:

This optimization has been done with dyngen in mind, we surely don't
want to keep it with TCG.

I currently have plenty of time, but almost no network (travelling), so I'll work on implementing a solution and will most probably commit the result tomorrow morning.

I already have a solution cooking for cpu_T64 and SPE, including improved TCGv setup. It'll be no problem for me to incorporate your and Thiemo's comments in my revised patch.

Here's a draft of what I've come up with so far:

- Based on your comments, make the GPRs always 32-bit for ppc and 64-bit for ppc64.
- Introduce cpu_T64[0..2] for ppc (this required a change in cpu.h for 64-bit hosts).
- Introduce cpu_gprh[0..31] in addition to cpu_gpr[0..31], for ppc only.
- Use sprintf for cpu_gpr, cpu_gprh names.

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 834c08d..c7291ed 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -33,17 +33,7 @@ typedef uint64_t ppc_gpr_t;

 #else /* defined (TARGET_PPC64) */
 /* PowerPC 32 definitions */
-#if (HOST_LONG_BITS >= 64)
-/* When using 64 bits temporary registers,
- * we can use 64 bits GPR with no extra cost
- * It's even an optimization as this will prevent
- * the compiler to do unuseful masking in the micro-ops.
- */
-typedef uint64_t ppc_gpr_t;
-#else /* (HOST_LONG_BITS >= 64) */
 typedef uint32_t ppc_gpr_t;
-#endif /* (HOST_LONG_BITS >= 64) */
-
 #define TARGET_LONG_BITS 32

 #if defined(TARGET_PPCEMB)
@@ -541,7 +531,7 @@ struct CPUPPCState {
     /* First are the most commonly used resources
      * during translated code execution
      */
-#if (HOST_LONG_BITS == 32)
+#if (TARGET_LONG_BITS > HOST_LONG_BITS) || !defined(TARGET_PPC64)
     /* temporary fixed-point registers
      * used to emulate 64 bits registers on 32 bits hosts
      */
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 9068936..c8ecfe2 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -44,15 +44,37 @@
 /*****************************************************************************/
 /* Code translation helpers */

-static TCGv cpu_env, cpu_T[3];
+/* global register indexes */
+static TCGv cpu_env;
+static char cpu_reg_names[10*3 + 22*4
+#if !defined(TARGET_PPC64)
+    + 10*4 + 22*5
+#endif
+];
+static TCGv cpu_gpr[32];
+#if !defined(TARGET_PPC64)
+static TCGv cpu_gprh[32];
+#endif
+
+/* dyngen register indexes */
+static TCGv cpu_T[3];
+#if defined(TARGET_PPC64)
+#define cpu_T64 cpu_T
+#else
+static TCGv cpu_T64[3];
+#endif

 #include "gen-icount.h"

 void ppc_translate_init(void)
 {
+    int i;
+    char* p;
     static int done_init = 0;
+
     if (done_init)
         return;
+
     cpu_env = tcg_global_reg_new(TCG_TYPE_PTR, TCG_AREG0, "env");
 #if TARGET_LONG_BITS > HOST_LONG_BITS
     cpu_T[0] = tcg_global_mem_new(TCG_TYPE_TL,
@@ -66,6 +88,31 @@ void ppc_translate_init(void)
     cpu_T[1] = tcg_global_reg_new(TCG_TYPE_TL, TCG_AREG2, "T1");
     cpu_T[2] = tcg_global_reg_new(TCG_TYPE_TL, TCG_AREG3, "T2");
 #endif
+#if !defined(TARGET_PPC64)
+    cpu_T64[0] = tcg_global_mem_new(TCG_TYPE_I64,
+                                    TCG_AREG0, offsetof(CPUState, t0),
+                                    "T0_64");
+    cpu_T64[1] = tcg_global_mem_new(TCG_TYPE_I64,
+                                    TCG_AREG0, offsetof(CPUState, t1),
+                                    "T1_64");
+    cpu_T64[2] = tcg_global_mem_new(TCG_TYPE_I64,
+                                    TCG_AREG0, offsetof(CPUState, t2),
+                                    "T2_64");
+#endif
+
+    p = cpu_reg_names;
+    for (i = 0; i < 32; i++) {
+        sprintf(p, "r%d", i);
+        cpu_gpr[i] = tcg_global_mem_new(TCG_TYPE_TL,
+            TCG_AREG0, offsetof(CPUState, gpr[i]), p);
+        p += (i < 10) ? 3 : 4;
+#if !defined(TARGET_PPC64)
+        sprintf(p, "r%dH", i);
+        cpu_gprh[i] = tcg_global_mem_new(TCG_TYPE_I32,
+            TCG_AREG0, offsetof(CPUState, gprh[i]), p);
+        p += (i < 10) ? 4 : 5;
+#endif
+    }

     /* register helpers */
 #undef DEF_HELPER
<snip>
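
For anyone checking the cpu_reg_names sizing in the patch above, here is a minimal standalone sketch of the same packing scheme in plain C, covering only the "r%d" names (the ppc64 case, without the "r%dH" entries). NUM_GPRS, GPR_NAMES_SIZE and pack_gpr_names() are names made up for this illustration and are not part of QEMU. It uses snprintf's return value to advance the pointer instead of the hard-coded 3/4 step, which should give the same layout.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NUM_GPRS 32
/* "r0".."r9" take 3 bytes each including the NUL, "r10".."r31" take 4. */
#define GPR_NAMES_SIZE (10 * 3 + 22 * 4)

/* Hypothetical helper, not QEMU code: packs "r0\0r1\0...r31\0" into buf
 * and records where each name starts. Returns the number of bytes used. */
static size_t pack_gpr_names(char *buf, size_t size,
                             const char *names[NUM_GPRS])
{
    char *p = buf;
    int i;

    for (i = 0; i < NUM_GPRS; i++) {
        size_t remaining = size - (size_t)(p - buf);
        /* snprintf returns the name length without the NUL terminator. */
        int len = snprintf(p, remaining, "r%d", i);

        /* The name plus its terminator must fit in what is left. */
        assert(len > 0 && (size_t)len < remaining);
        names[i] = p;
        p += len + 1;
    }
    return (size_t)(p - buf);
}
```

Called with a buffer of GPR_NAMES_SIZE bytes, the function should report that exactly GPR_NAMES_SIZE bytes were used, which is the property the 10*3 + 22*4 expression relies on.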

I guess SPE will rarely be used anyway, so I think it's relatively "safe" for me to remove the dyngen optimization and simplify ppc_gpr_t during this transition.

What I meant was that the ppc_gpr_t change still lets ppc-softmmu and ppc-linux-user compile, and to my understanding neither of these uses SPE by default at runtime.

I did not want to comment on whether the SPE instructions are fully implemented in QEMU or widely used in external code somewhere.

As long as SPE compiles and isn't irrevocably damaged, we're good I think. We'll have to go through and remove the cpu_T stuff later and can optimize SPE with, e.g., local i64 temporaries then.
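
To illustrate what such a 64-bit temporary would have to carry, here is a rough sketch in plain C, not in TCG ops. It assumes that gprh[n] holds the upper 32 bits and gpr[n] the lower 32 bits of a 64-bit SPE register on 32-bit ppc; spe_concat() and spe_split() are hypothetical names for this example only.

```c
#include <stdint.h>

/* Hypothetical helper, not QEMU API: build the 64-bit SPE value from
 * the high half (assumed to live in gprh) and the low half (gpr). */
static uint64_t spe_concat(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}

/* Hypothetical helper: write a 64-bit result back as two 32-bit halves. */
static void spe_split(uint64_t value, uint32_t *high, uint32_t *low)
{
    *high = (uint32_t)(value >> 32);
    *low = (uint32_t)value;
}
```

An SPE operation would then concatenate the two halves into one temporary, operate on it, and split the result back into the two register arrays.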

Andreas




