qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] [PATCH for-2.6 2/3] mirror: Skip BH for mirror_exit if in m


From: Fam Zheng
Subject: [Qemu-devel] [PATCH for-2.6 2/3] mirror: Skip BH for mirror_exit if in main loop
Date: Fri, 22 Apr 2016 14:35:06 +0800

Commit 5a7e7a0ba moved mirror_exit to a BH handler but didn't add any
protection against guest requests that could sneak in before the BH is
dispatched. For example, this could happen (assuming a code base at that
commit):

        main_loop_wait # 1
          os_host_main_loop_wait
            g_main_context_dispatch
              aio_ctx_dispatch
                aio_dispatch
                  ...
                    mirror_run
                      bdrv_drain
    (a)               block_job_defer_to_main_loop
          qemu_iohandler_poll
            virtio_queue_host_notifier_read
              ...
                virtio_submit_multiwrite
    (b)           blk_aio_multiwrite

        main_loop_wait # 2
          <snip>
                aio_dispatch
                  aio_bh_poll
    (c)             mirror_exit

At (a) we know the BDS has no pending request. However, the same
main_loop_wait call is going to dispatch iohandlers (EventNotifier
events), which may lead to a new I/O from guest. So the invariant is
already broken at (c). Data loss.

Commit f3926945c8 made iohandler to use aio API.  The order of
virtio_queue_host_notifier_read and block_job_defer_to_main_loop within
a main_loop_wait becomes unpredictable, and even worse, if the host
notifier event arrives at the next main_loop_wait call, the
unpredictable order between mirror_exit and
virtio_queue_host_notifier_read is also a trouble. As shown below, this
commit made the bug easier to trigger:

    - Bug case 1:

        main_loop_wait # 1
          os_host_main_loop_wait
            g_main_context_dispatch
              aio_ctx_dispatch (qemu_aio_context)
                ...
                  mirror_run
                    bdrv_drain
    (a)             block_job_defer_to_main_loop
              aio_ctx_dispatch (iohandler_ctx)
                virtio_queue_host_notifier_read
                  ...
                    virtio_submit_multiwrite
    (b)               blk_aio_multiwrite

        main_loop_wait # 2
          ...
                aio_dispatch
                  aio_bh_poll
    (c)             mirror_exit

    - Bug case 2:

        main_loop_wait # 1
          os_host_main_loop_wait
            g_main_context_dispatch
              aio_ctx_dispatch (qemu_aio_context)
                ...
                  mirror_run
                    bdrv_drain
    (a)             block_job_defer_to_main_loop

        main_loop_wait # 2
          ...
            aio_ctx_dispatch (iohandler_ctx)
              virtio_queue_host_notifier_read
                ...
                  virtio_submit_multiwrite
    (b)             blk_aio_multiwrite
              aio_dispatch
                aio_bh_poll
    (c)           mirror_exit

In both cases, (b) breaks the invariant wanted by (a) and (c).

Unfortunately, the request loss has been silent, until 3f09bfbc7be added
an assertion at (c) to check the invariant in
bdrv_replace_in_backing_chain. Max reproduced an assertion failure after
that commit, by doing active committing while the guest is running
bonnie++.

2.5 added bdrv_drained_begin at (a) to protect the dataplane case from
similar problems, but we never realize the main loop bug until now.

As a bandage, this patch undoes the change of 5a7e7a0bad17 in
non-dataplane case, and calls mirror_replace directly in mirror_run.

The next step after 2.6 is to complete bdrv_drained_begin complete
(including both fixing the dispatching code in the main loop and marking
host event notifiers etc. as external) and then this special casing is
not necessary.

Launchpad Bug: 1570134

Cc: address@hidden
Signed-off-by: Fam Zheng <address@hidden>
---
 block/mirror.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/mirror.c b/block/mirror.c
index 6c3fe43..bc77773 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -726,6 +726,13 @@ immediate_exit:
     /* Before we switch to target in mirror_exit, make sure data doesn't
      * change. */
     bdrv_drained_begin(s->common.bs);
+    if (qemu_get_aio_context() == bdrv_get_aio_context(bs)) {
+        /* FIXME: if we are in the main loop, the iohandler doesn't honor
+         * bdrv_drained_begin yet, and guest requests can sneak in before BH
+         * callback runs. Do the replace now to avoid it. */
+        mirror_replace(s, &data->ret);
+        data->should_replace = false;
+    }
     block_job_defer_to_main_loop(&s->common, mirror_exit, data);
 }
 
-- 
2.8.0




reply via email to

[Prev in Thread] Current Thread [Next in Thread]