Re: [Qemu-devel] [PATCH] migration: vectorize is_dup_page

From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH] migration: vectorize is_dup_page
Date: Tue, 20 Dec 2011 08:13:25 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/20110922 Lightning/1.0b2 Thunderbird/3.1.15

On 12/06/2011 11:25 AM, Paolo Bonzini wrote:
is_dup_page is already proceeding in 32-bit chunks.  Changing it to 16
bytes using Altivec or SSE is easy, and provides a noticeable improvement.
Pierre Riteau measured 30->25 seconds on a 16GB guest, I measured 4.6->3.9
seconds on a 6GB guest (best of three times for me; dunno for Pierre).
Both of them are approximately a 15% improvement.

I tried playing with non-temporal prefetches, but I did not get any
improvement (though I did get fewer cache misses, so the patch was doing
its job).

Signed-off-by: Paolo Bonzini <address@hidden>
  arch_init.c |   28 ++++++++++++++++++++++------
  1 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index cdad805..473df2d 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -94,14 +94,30 @@ const uint32_t arch_type = QEMU_ARCH;
  #define RAM_SAVE_FLAG_EOS      0x10

-static int is_dup_page(uint8_t *page, uint8_t ch)
+#if __ALTIVEC__

I think you want #ifdefs here and possibly below:

  CC    x86_64-softmmu/arch_init.o
cc1: warnings being treated as errors
/home/anthony/git/qemu/arch_init.c:97:5: error: "__ALTIVEC__" is not defined
/home/anthony/git/qemu/arch_init.c: In function ‘is_dup_page’:
/home/anthony/git/qemu/arch_init.c:116:5: error: incompatible type for argument 1 of ‘_mm_set1_epi8’
/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/include/emmintrin.h:636:1: note: expected ‘char’ but argument is of type ‘__m128i’


Anthony Liguori
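
The first error above is the diagnostic GCC emits when an undefined macro is evaluated in a plain #if and undefined-macro warnings are treated as errors. A minimal sketch of the guarded form suggested here, assuming nothing beyond the compiler's own predefined macros (VEC_BACKEND and vec_backend() are hypothetical names used only to make the selected branch visible, not part of the patch):

```c
/* Testing with defined() never evaluates an undefined macro,
 * so it stays quiet even when the macro is absent. */
#if defined(__ALTIVEC__)
#define VEC_BACKEND "altivec"
#elif defined(__SSE2__)
#define VEC_BACKEND "sse2"
#else
#define VEC_BACKEND "scalar"
#endif

/* Hypothetical helper: reports which branch the preprocessor picked. */
static const char *vec_backend(void)
{
    return VEC_BACKEND;
}
```

On an x86_64 build this would normally select the SSE2 branch, since the compiler predefines __SSE2__ there.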

+#define VECTYPE        vector unsigned char
+#define SPLAT(p)       vec_splat(vec_ld(0, p), 0)
+#define ALL_EQ(v1, v2) vec_all_eq(v1, v2)
+#elif __SSE2__
+#define VECTYPE        __m128i
+#define SPLAT(p)       _mm_set1_epi8(*(p))
+#define ALL_EQ(v1, v2) (_mm_movemask_epi8(_mm_cmpeq_epi8(v1, v2)) == 0xFFFF)
+#else
+#define VECTYPE        unsigned long
+#define SPLAT(p)       (*(p) * (~0UL / 255))
+#define ALL_EQ(v1, v2) ((v1) == (v2))
+#endif
+
+static int is_dup_page(uint8_t *page)
  {
-    uint32_t val = ch << 24 | ch << 16 | ch << 8 | ch;
-    uint32_t *array = (uint32_t *)page;
+    VECTYPE *p = (VECTYPE *)page;
+    VECTYPE val = SPLAT(p);
      int i;

-    for (i = 0; i < (TARGET_PAGE_SIZE / 4); i++) {
-        if (array[i] != val) {
+    for (i = 0; i < TARGET_PAGE_SIZE / sizeof(VECTYPE); i++) {
+        if (!ALL_EQ(val, p[i])) {
+        if (!ALL_EQ(val, p[i])) {
              return 0;
@@ -136,7 +152,7 @@ static int ram_save_block(QEMUFile *f)

              p = block->host + offset;

-            if (is_dup_page(p, *p)) {
+            if (is_dup_page(p)) {
                  qemu_put_be64(f, offset | cont | RAM_SAVE_FLAG_COMPRESS);
                  if (!cont) {
                      qemu_put_byte(f, strlen(block->idstr));
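
The second compiler error arises because SPLAT() is handed a VECTYPE pointer, so *(p) is a whole __m128i rather than a single byte. One possible way to address both errors is sketched below; it is an illustration under stated assumptions, not the revised patch. SKETCH_PAGE_SIZE stands in for TARGET_PAGE_SIZE, is_dup_page_sketch is a hypothetical name, the Altivec branch is omitted for brevity, and memcpy is used for the loads so the sketch does not rely on the page being vector-aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096   /* assumption: stands in for TARGET_PAGE_SIZE */

#if defined(__SSE2__)
#include <emmintrin.h>
#define VECTYPE        __m128i
/* Broadcast one byte into all 16 lanes. */
#define SPLAT(b)       _mm_set1_epi8((char)(b))
#define ALL_EQ(v1, v2) (_mm_movemask_epi8(_mm_cmpeq_epi8(v1, v2)) == 0xFFFF)
#else
#define VECTYPE        unsigned long
/* ~0UL / 255 is 0x0101...01, so the product repeats the byte in every position. */
#define SPLAT(b)       ((unsigned long)(b) * (~0UL / 255))
#define ALL_EQ(v1, v2) ((v1) == (v2))
#endif

/* Returns 1 if every byte of the page equals its first byte, else 0. */
static int is_dup_page_sketch(const uint8_t *page)
{
    VECTYPE val;
    size_t i;

    assert(page != NULL);
    val = SPLAT(page[0]);   /* pass a byte, not a vector, to the splat */

    for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(VECTYPE); i++) {
        VECTYPE chunk;
        /* Copy one vector-sized chunk out of the page, then compare it. */
        memcpy(&chunk, page + i * sizeof(VECTYPE), sizeof(chunk));
        if (!ALL_EQ(val, chunk)) {
            return 0;
        }
    }
    return 1;
}
```

A caller would use it the same way the patched ram_save_block() calls is_dup_page(p): pass the page pointer and treat a nonzero result as a page that can be sent in compressed form.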
