Re: [Qemu-devel] sh : performance problem


From: Shin-ichiro KAWASAKI
Subject: Re: [Qemu-devel] sh : performance problem
Date: Thu, 05 Mar 2009 00:12:58 +0900
User-agent: Thunderbird 2.0.0.19 (Windows/20081209)

Lionel Landwerlin wrote:
On Wednesday, 04 March 2009 at 00:46 +0900, Shin-ichiro KAWASAKI wrote:
Lionel Landwerlin wrote:
Shin-ichiro,

Sorry, but I cannot apply your patch cleanly on the last qemu-svn.

Instead, I would like to try another approach. The patch you proposed to
find (or not) a valid TLB entry has a complexity of O(log2(n)) (or
something like that, if I remember correctly). Here is a patch with a
complexity of O(1) instead.
Good work.  I evaluated your patch in my environment by measuring the
time gcc takes to compile an empty main().

  sh4 : 5.8 [seconds]     O(n) utlb search.
  sh4 : 4.6 [seconds]     O(log2(n)) utlb search.
  sh4 : 4.1 [seconds]     O(1) utlb search by Lionel
  arm : 0.8 [seconds]     (-M versatilepb + Debian ARM)

Your patch has a nice score!

I have now increased the number of utlb entries from 64 to 256,
and found that the score drops to around 2.4 seconds.
I'm trying to increase it to 4096.  Your O(1) search will become more
important as the number of entries increases.
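
To illustrate why the search cost matters more as the table grows, here is a
minimal sketch comparing a linear scan with a constant-time lookup. It is NOT
Lionel's patch: the direct-mapped scheme, the demo_tlb_entry type, the
DEMO_PAGE_SHIFT constant (4 KiB pages only) and all function names are
assumptions made for illustration. The real SH4 utlb is fully associative, so
an actual O(1) search has to preserve that behaviour in some other way.

```c
#include <stdint.h>

#define UTLB_BITS 8
#define UTLB_SIZE (1 << UTLB_BITS)
#define DEMO_PAGE_SHIFT 12          /* assumption: 4 KiB pages only */

/* Hypothetical, simplified entry; not the tlb_t from target-sh4/cpu.h. */
typedef struct {
    uint32_t vpn;                   /* virtual page number */
    uint32_t ppn;                   /* physical page number */
    uint8_t  valid;
} demo_tlb_entry;

/* O(n): every entry is compared, so cost grows with UTLB_SIZE. */
static int lookup_linear(const demo_tlb_entry *tlb, uint32_t vaddr)
{
    uint32_t vpn = vaddr >> DEMO_PAGE_SHIFT;
    for (int i = 0; i < UTLB_SIZE; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn)
            return i;
    }
    return -1;
}

/* The slot is derived from the address itself (direct-mapped). */
static unsigned slot_for(uint32_t vaddr)
{
    return (vaddr >> DEMO_PAGE_SHIFT) & (UTLB_SIZE - 1);
}

/* O(1): exactly one entry is inspected, whatever UTLB_SIZE is. */
static int lookup_direct(const demo_tlb_entry *tlb, uint32_t vaddr)
{
    unsigned i = slot_for(vaddr);
    uint32_t vpn = vaddr >> DEMO_PAGE_SHIFT;
    return (tlb[i].valid && tlb[i].vpn == vpn) ? (int)i : -1;
}

/* Fill the slot that lookup_direct() will later inspect. */
static void insert_direct(demo_tlb_entry *tlb, uint32_t vaddr, uint32_t ppn)
{
    demo_tlb_entry *e = &tlb[slot_for(vaddr)];
    e->vpn = vaddr >> DEMO_PAGE_SHIFT;
    e->ppn = ppn;
    e->valid = 1;
}
```

With 64 entries the linear scan does at most 64 comparisons per lookup; with
4096 it does up to 4096, while the direct lookup still does one.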

We should set up a git repository to improve that, or at least work on the
same basis.

Do you have a lot of patches on top of the latest svn?

I personally maintain a staging repository, 'qemu-sh':

 git://github.com/kawasaki/qemu-sh.git
 http://github.com/kawasaki/qemu-sh/tree/master

The patches in it apply to the current qemu subversion trunk.
I have also added your O(1) patch and my patch that increases the number
of utlb entries.
Will it work as the basis?

I attach the patch that increases the number of utlb entries to this mail
for reference.
It must be applied on top of the other patches in my staging repository;
otherwise the patch command will fail.
It increases the number of utlb entries from 64 to 256.
With it, the URC and URB fields in MMUCR can take values from 0 to 255.
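
As a standalone illustration of the widened fields, the sketch below extracts
URC and URB with 8-bit masks and advances URC with the same wrap rule as the
attached patch. The offsets and UTLB_BITS come from the patch; the function
names, and the use of uint32_t instead of uint8_t for the counters, are my
assumptions for this example and not what the patch itself does.

```c
#include <stdint.h>

#define UTLB_BITS 8
#define UTLB_SIZE (1 << UTLB_BITS)
#define MMUCR_URC_OFFSET 10
#define MMUCR_URB_OFFSET 18
#define MMUCR_URC_MASK ((uint32_t)(UTLB_SIZE - 1) << MMUCR_URC_OFFSET)
#define MMUCR_URB_MASK ((uint32_t)(UTLB_SIZE - 1) << MMUCR_URB_OFFSET)

/* Extract the replacement counter (URC) from an MMUCR value. */
static inline uint32_t demo_mmucr_urc(uint32_t mmucr)
{
    return (mmucr & MMUCR_URC_MASK) >> MMUCR_URC_OFFSET;
}

/* Extract the replacement boundary (URB) from an MMUCR value. */
static inline uint32_t demo_mmucr_urb(uint32_t mmucr)
{
    return (mmucr & MMUCR_URB_MASK) >> MMUCR_URB_OFFSET;
}

/*
 * Hypothetical helper: return MMUCR with URC advanced by one.
 * URC wraps to 0 when it passes a non-zero URB or the last utlb index.
 * All other MMUCR bits are left unchanged.
 */
static uint32_t demo_mmucr_increment_urc(uint32_t mmucr)
{
    uint32_t urb = demo_mmucr_urb(mmucr);
    uint32_t urc = demo_mmucr_urc(mmucr) + 1;

    if ((urb > 0 && urc > urb) || urc > UTLB_SIZE - 1)
        urc = 0;
    return (mmucr & ~MMUCR_URC_MASK) | (urc << MMUCR_URC_OFFSET);
}
```

Note that URC occupies bits 10 to 17 and URB starts at bit 18, so with these
offsets 8 bits is the widest the URC field can be before it runs into URB.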

Here are the benchmark values.

  sh4 : 5.8 [seconds]     O(n) utlb search.
  sh4 : 4.6 [seconds]     O(log2(n)) utlb search.
  sh4 : 4.1 [seconds]     O(1) utlb search by Lionel
  sh4 : 2.1 [seconds]     O(1) utlb search, and 256 utlb entries  <=
  arm : 0.8 [seconds]     (-M versatilepb + Debian ARM)

Getting better :) but still behind arm.
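
To put numbers on "better but still behind", the ratios can be computed
directly from the table. This is only a small sketch; the constant and
function names are made up for the example, and the timings are the ones
listed above.

```c
/* Timings copied from the benchmark table, in seconds. */
static const double T_SH4_LINEAR = 5.8;  /* O(n) utlb search */
static const double T_SH4_O1_256 = 2.1;  /* O(1) search, 256 utlb entries */
static const double T_ARM        = 0.8;  /* -M versatilepb + Debian ARM */

/* How many times faster `after` is compared with `before`. */
static double speedup(double before, double after)
{
    return before / after;
}
```

speedup(T_SH4_LINEAR, T_SH4_O1_256) is a little under 2.8, and
speedup(T_SH4_O1_256, T_ARM) is a little over 2.6, so sh4 is now roughly 2.8
times faster than the original code but arm is still about 2.6 times faster
than sh4.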

Regards,
Shin-ichiro KAWASAKI


Index: trunk/target-sh4/cpu.h
===================================================================
--- trunk/target-sh4/cpu.h      (revision 6676)
+++ trunk/target-sh4/cpu.h      (working copy)
@@ -88,7 +88,8 @@
     uint8_t tc;                 /* timing control */
 } tlb_t;

-#define UTLB_SIZE 64
+#define UTLB_BITS 8     /* real hard : 6 */
+#define UTLB_SIZE (1 << UTLB_BITS)
 #define ITLB_SIZE 4

 #define NB_MMU_MODES 2
@@ -228,14 +229,18 @@
 #define MMUCR    0x1F000010
 #define MMUCR_AT (1<<0)
 #define MMUCR_SV (1<<8)
-#define MMUCR_URC_BITS (6)
 #define MMUCR_URC_OFFSET (10)
-#define MMUCR_URC_SIZE (1 << MMUCR_URC_BITS)
-#define MMUCR_URC_MASK (((MMUCR_URC_SIZE) - 1) << MMUCR_URC_OFFSET)
+#define MMUCR_URC_MASK ((UTLB_SIZE - 1) << MMUCR_URC_OFFSET)
 static inline int cpu_mmucr_urc (uint32_t mmucr)
 {
     return ((mmucr & MMUCR_URC_MASK) >> MMUCR_URC_OFFSET);
 }
+#define MMUCR_URB_OFFSET (18)
+#define MMUCR_URB_MASK ((UTLB_SIZE - 1) << MMUCR_URB_OFFSET)
+static inline int cpu_mmucr_urb (uint32_t mmucr)
+{
+    return ((mmucr & MMUCR_URB_MASK) >> MMUCR_URB_OFFSET);
+}

 /* PTEH : Page Translation Entry High register */
 #define PTEH_ASID_BITS (8)
Index: trunk/target-sh4/helper.c
===================================================================
--- trunk/target-sh4/helper.c   (revision 6676)
+++ trunk/target-sh4/helper.c   (working copy)
@@ -346,12 +346,12 @@
     uint8_t urb, urc;

     /* Increment URC */
-    urb = ((env->mmucr) >> 18) & 0x3f;
-    urc = ((env->mmucr) >> 10) & 0x3f;
+    urb = cpu_mmucr_urb(env->mmucr);
+    urc = cpu_mmucr_urc(env->mmucr);
     urc++;
     if ((urb > 0 && urc > urb) || urc > (UTLB_SIZE - 1))
-       urc = 0;
-    env->mmucr = (env->mmucr & 0xffff03ff) | (urc << 10);
+        urc = 0;
+    env->mmucr = (env->mmucr & ~MMUCR_URC_MASK) | (urc << MMUCR_URC_OFFSET);
 }

 /* Find itlb entry - update itlb from utlb if necessary and asked for



