qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] memory: Fix qemu crash on starting dirty log twice with stop


From: Paolo Bonzini
Subject: Re: [PATCH] memory: Fix qemu crash on starting dirty log twice with stopped VM
Date: Mon, 7 Feb 2022 10:08:44 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.0

On 2/7/22 04:26, Peter Xu wrote:
QEMU can now easily crash with two continuous migration carried out:

(qemu) migrate -d exec:cat>out
(qemu) migrate_cancel
(qemu) migrate -d exec:cat>out
[crash] ../softmmu/memory.c:2782: memory_global_dirty_log_start: Assertion
`!(global_dirty_tracking & flags)' failed.

It's because memory API provides a way to postpone dirty log stop if the VM is
stopped, and that'll be re-done until the next VM start.  It was added in 2017
with commit 1931076077 ("migration: optimize the downtime", 2017-08-01).

However the recent work on allowing dirty tracking to be bitmask broke it,
which is commit 63b41db4bc ("memory: make global_dirty_tracking a bitmask",
2021-11-01).

The fix proposed in this patch contains two things:

   (1) Instead of passing over the flags to postpone stop dirty track, we add a
       global variable (along with current vmstate_change variable) to record
       what flags to stop dirty tracking.

   (2) When start dirty tracking, instead if remove the vmstate hook directly,
       we also execute the postponed stop process so that we make sure all the
       starts and stops will be paired.

This procedure is overlooked in the bitmask-ify work in 2021.

Actually with the postponed stop flags, a smarter thing to do is when start
dirty logging with specific flag, we can ignore the start if the flag is
contained in the "postponed to stop" flags, however that'll slightly complicate
the code, and it shouldn't be a major use case for QEMU.  Considering that this
should also copy stable, keep it simple while fixing the crash.

Cc: Hyman Huang <huangy81@chinatelecom.cn>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2044818
Fixes: 63b41db4bc ("memory: make global_dirty_tracking a bitmask")
Signed-off-by: Peter Xu <peterx@redhat.com>
---
  softmmu/memory.c | 53 ++++++++++++++++++++++++++++++++++--------------
  1 file changed, 38 insertions(+), 15 deletions(-)

diff --git a/softmmu/memory.c b/softmmu/memory.c
index 678dc62f06..cbb9b241ea 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -2790,16 +2790,25 @@ void memory_global_after_dirty_log_sync(void)
      MEMORY_LISTENER_CALL_GLOBAL(log_global_after_sync, Forward);
  }
+/*
+ * Dirty track stop flags that are postponed due to VM being stopped.  Should
+ * only be used within vmstate_change hook.
+ */
+static unsigned int postponed_stop_flags;
  static VMChangeStateEntry *vmstate_change;
+static void memory_global_dirty_log_stop_postponed_run(void);
void memory_global_dirty_log_start(unsigned int flags)
  {
      unsigned int old_flags = global_dirty_tracking;
- if (vmstate_change) {
-        qemu_del_vm_change_state_handler(vmstate_change);
-        vmstate_change = NULL;
-    }
+    /*
+     * If there is postponed dirty log stop flags, do it, so that start/stop
+     * will always be paired.  A smarter thing to do is ignore start process if
+     * the same flag has got postponed on stop, but it shouldn't matter a lot
+     * in major user scenarios, so keep the code simple for now.
+     */
+    memory_global_dirty_log_stop_postponed_run();

I think this should be as easy as doing here:

    postponed_stop_flags &= ~flags;
    memory_global_dirty_log_stop_postponed_run();
    flags &= ~global_dirty_tracking_flags;
    if (!flags) {
        return;
    }

plus adding an "if (postponed_stop_flags)" in memory_global_dirty_log_stop_postponed_run?

Paolo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]