qemu-trivial
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-trivial] [PATCH] arch_init.c: remove duplicate function


From: Michael R. Hines
Subject: Re: [Qemu-trivial] [PATCH] arch_init.c: remove duplicate function
Date: Tue, 15 Apr 2014 07:55:48 +0800
User-agent: Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20100101 Thunderbird/24.4.0

On 04/14/2014 05:19 PM, Laszlo Ersek wrote:
On 04/14/14 04:27, Amos Kong wrote:
We already have a function buffer_is_zero() in util/cutils.c

Signed-off-by: Amos Kong <address@hidden>
---
  arch_init.c | 9 ++-------
  1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 60c975d..342e5dc 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -152,11 +152,6 @@ int qemu_read_default_config_files(bool userconfig)
      return 0;
  }
-static inline bool is_zero_range(uint8_t *p, uint64_t size)
-{
-    return buffer_find_nonzero_offset(p, size) == size;
-}
-
  /* struct contains XBZRLE cache and a static page
     used by the compression */
  static struct {
@@ -587,7 +582,7 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
                          acct_info.dup_pages++;
                      }
                  }
-            } else if (is_zero_range(p, TARGET_PAGE_SIZE)) {
+            } else if (buffer_is_zero(p, TARGET_PAGE_SIZE)) {
                  acct_info.dup_pages++;
                  bytes_sent = save_block_hdr(f, block, offset, cont,
                                              RAM_SAVE_FLAG_COMPRESS);
@@ -983,7 +978,7 @@ static inline void *host_from_stream_offset(QEMUFile *f,
   */
  void ram_handle_compressed(void *host, uint8_t ch, uint64_t size)
  {
-    if (ch != 0 || !is_zero_range(host, size)) {
+    if (ch != 0 || !buffer_is_zero(host, size)) {
          memset(host, ch, size);
      }
  }

This seems to be correct, functionally -- apparently buffer_is_zero()
has laxer size requirements than buffer_find_nonzero_offset(). However,
I think the latter might be faster.

For ram_save_block() I guess the difference is negligible. But
ram_handle_compressed() is also called from "migration-rdma.c", where I
can't even guess if a little bit of slowdown would count.

I'm OK with the patch if Michael (CC'd) is.

Thanks
Laszlo


Thanks for the CC.

Actually, it looks like buffer_is_zero() is calling buffer_find_nonzero_offset()
as a "first try" anyway - which is the same thing RDMA is doing. So, all
calls to ram_handle_compressed() should hit the branch target there in
buffer_is_zero() for can_use_buffer_find_nonzero_offset() and automatically
drop into the vectorized-optimization to search for zeros, so there shouldn't
be any change in performance. The same should apply for TCP migration
as well - it's working on page-granularity, which is always aligned to 32 or 64 bits.

Paolo? I see that some of the block-migration code and the qemu-img code is also
calling buffer_is_zero() - are you guys depending on the performance of any
buffer_is_zero() calls to use the vector-optimized version like migration does?

Just for RDMA: Reviewed-by: Michael R. Hines <address@hidden>

- Michael




reply via email to

[Prev in Thread] Current Thread [Next in Thread]