
[Qemu-devel] dirty page count problem


From: Dr. David Alan Gilbert
Subject: [Qemu-devel] dirty page count problem
Date: Fri, 21 Jul 2017 18:28:33 +0100
User-agent: Mutt/1.8.3 (2017-05-23)

Hi,
  Git bisect is pointing to your patch 084140bd49:
  exec: fix access to ram_list.dirty_memory when sync dirty bitmap

I ran the bisect while trying to diagnose a bug I'm seeing; it looks like
the dirty page count is wrong for some reason.

Alex Bennée spotted a problem where the postcopy test would occasionally
fail under very heavy load; attaching a debugger, it looks like the
problem is that the migration_dirty_pages count is stuck at 2.
In the normal migration tests we don't spot this, because 2 pages is
smaller than the threshold to end migration, so the extra 2 pages
don't block it finishing. However, with a very small downtime setting
(like we use in the postcopy test) and with very low bandwidth (as when
Alex ran the test on a very heavily loaded machine) we end up never
calling the bitmap sync again and never completing the iteration.
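
To make the arithmetic concrete, here is a rough standalone model of that
completion decision. It is not QEMU code: the function names, the
bytes-per-millisecond unit and the strict "<" comparison are all
assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model, not QEMU code: how many bytes we expect to be
 * able to send within the allowed downtime at the measured bandwidth.
 */
static uint64_t threshold_bytes(uint64_t bandwidth_bytes_per_ms,
                                uint64_t downtime_ms)
{
    return bandwidth_bytes_per_ms * downtime_ms;
}

/*
 * Hypothetical model: migration may finish once the data still pending
 * (dirty pages times page size) is below the threshold. The strict "<"
 * is an assumption of this sketch.
 */
static bool can_complete(uint64_t dirty_pages, uint64_t page_size,
                         uint64_t bandwidth_bytes_per_ms,
                         uint64_t downtime_ms)
{
    uint64_t pending = dirty_pages * page_size;

    return pending < threshold_bytes(bandwidth_bytes_per_ms, downtime_ms);
}
```

With 4096-byte pages, two stuck pages are 8192 pending bytes. Against a
generous threshold (say 1000 bytes/ms for 300 ms) that is easily small
enough to finish, which is why the normal tests hide the problem; against
a tiny threshold (say 1 byte/ms for 1 ms) the condition can never become
true while the count stays at 2.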

I'm using the following addition to spot the problem:

diff --git a/migration/ram.c b/migration/ram.c
index e75f1050e4..3ddf884952 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1350,6 +1350,13 @@ static int ram_find_and_save_block(RAMState *rs, bool last_stage)
         }
     } while (!pages && again);

+    if (!pages && !again && pss.complete_round && rs->migration_dirty_pages)
+    {
+        /* Should make this fail migration ? */
+        fprintf(stderr, "%s: no page found, yet dirty_pages=%"PRIu64"\n",
+                __func__, rs->migration_dirty_pages);
+    }
+
     rs->last_seen_block = pss.block;
     rs->last_page = pss.page;

(which I might add as a test to fail a migration)

That check triggers easily even on an unloaded machine:
tests/postcopy-test
/x86_64/postcopy: ram_find_and_save_block: no page found, yet dirty_pages=2
ram_find_and_save_block: no page found, yet dirty_pages=2
ram_find_and_save_block: no page found, yet dirty_pages=2
OK
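
Put another way, the message means a complete pass over the dirty bitmap
found no set bits while the counter was still non-zero. A minimal
standalone sketch of that invariant follows; the helper names are
hypothetical and this is plain C rather than QEMU's bitmap API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the set bits in the first nbits of bitmap. */
static uint64_t count_set_bits(const unsigned long *bitmap, size_t nbits)
{
    const size_t bits_per_word = sizeof(unsigned long) * 8;
    uint64_t count = 0;

    for (size_t i = 0; i < nbits; i++) {
        if (bitmap[i / bits_per_word] & (1UL << (i % bits_per_word))) {
            count++;
        }
    }
    return count;
}

/*
 * Hypothetical check: the running dirty-page counter should always equal
 * the number of bits set in the bitmap it is meant to summarise.
 */
static bool counter_matches_bitmap(const unsigned long *bitmap,
                                   size_t nbits, uint64_t counter)
{
    return count_set_bits(bitmap, nbits) == counter;
}
```

In this model the state seen above corresponds to an all-zero bitmap
with a counter of 2, for which counter_matches_bitmap() returns false.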


I'll try and debug where our extra two pages are coming from.

Dave
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


