qemu-devel

Re: [PATCH] block/mirror: add 'write-blocking-after-ready' copy mode


From: Denis V. Lunev
Subject: Re: [PATCH] block/mirror: add 'write-blocking-after-ready' copy mode
Date: Thu, 2 Feb 2023 12:09:22 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

On 2/2/23 11:19, Fiona Ebner wrote:
Am 31.01.23 um 19:18 schrieb Denis V. Lunev:
On 1/31/23 18:44, Vladimir Sementsov-Ogievskiy wrote:
+ Den

Den, I remember we thought about that, and probably had a solution?

Another possible approach to get benefits from both modes is to switch
to blocking mode after first loop of copying. [*]

Hmm. Thinking about the proposed solution, it seems that [*] is better.
The main reason for "write-blocking" mode is to force convergence of
the mirror job. But you lose that if you postpone the switch to the
moment when the mirror becomes READY: we may never reach it.



Or, may be we can selectively skip or block guest writes, to keep some
specified level of convergence?

Or, we can start in "background" mode, track the speed of convergence
(like dirty-delta per minute), and switch to blocking if speed becomes
less than some threshold.

That is absolutely correct. It would be very good to start with
the asynchronous mirror and switch to the synchronous one after a
specific number of loops. That number should somehow be measured.

Originally we had thought about switching after one loop
(when the basic background sending is done), but maybe 2 iterations
(initial + first one) would be better.

Talking about this specific approach and our past experience,
I should note that reaching the completion point in the
asynchronous mode was a real pain and a slowdown for the guest.
We intentionally switched to the sync mode at that time
for that purpose. Thus this patch is a "worst half-way" for
us.
While the asynchronous mode can take longer of course, the slowdown of
the guest is bigger with the synchronous mode, or what am I missing?

We have been using asynchronous mode for years (starting migration after
all drive-mirror jobs are READY) and AFAIK our users did not complain
about them not reaching READY or heavy slowdowns in the guest. We had
two reports about an assertion failure recently, when drive-mirror still
had work left at the time the block drives were inactivated by migration.
We have internal tests which perform migration with different
I/O write rates during the process. We have been using such tests
for more than 10 years, since before we switched to QEMU
from our previous proprietary hypervisor.

Once the switch happened, we started to experience
failures here due to the asynchronous nature of the mirror. That
is why we raised the initial discussion and this code has
appeared. I may be incorrect or too self-oriented here,
but this is my understanding of the situation :)

Frankly speaking, I would say that this switch could be considered
NOT QEMU's job: we should just send a notification (event) on the
completion of each iteration, and the management software should
take the decision to switch from async mode to the sync one.
Unfortunately, our management software is a bit limited in that regard
currently and making listening for events available in the necessary
place would take a bit of work. Having the switch command would nearly
be enough for us (we'd just switch after READY). But we'd also need the
command to finish all remaining asynchronous operations when the switch
happens after READY.
That could be a matter for another event, I believe. We switch the mode
and reset the state. A new READY event will be sent once the bitmap is
cleared. That seems fair.
Otherwise, the original issue
with inactivating block drives while mirror still has work remains. Do
those semantics for the switch sound acceptable?

Under such a condition we should have a command to perform the mode
switch. This seems more the QEMU/QAPI way :)

I feel like a switch at an arbitrary time might be hard to do correctly,
i.e. to not miss some (implicit) assertion.

But I can try and go for this approach if it's more likely to be
accepted upstream. So a 'drive-mirror-change' command which takes the
'job-id' and a 'copy-mode' argument? And only allow the switch from
'background' to 'active' (initially)?
That is what I cannot guarantee, as we have Kevin and Stefan here. Though
in general this opens interesting possibilities for other use-cases :)

Or a more generic 'block-job-change'? But that would need a way to take
'JobType'-specific arguments. Is that doable?
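
Such a generic command could be sketched in QAPI as a union discriminated on the job type. The fragment below is a hypothetical schema sketch, not the existing QAPI definitions; all names besides 'JobType' and 'MirrorCopyMode' are invented for illustration:

```
{ 'struct': 'BlockJobChangeOptionsMirror',
  'data': { 'copy-mode': 'MirrorCopyMode' } }

{ 'union': 'BlockJobChangeOptions',
  'base': { 'id': 'str', 'type': 'JobType' },
  'discriminator': 'type',
  'data': { 'mirror': 'BlockJobChangeOptionsMirror' } }

{ 'command': 'block-job-change',
  'data': 'BlockJobChangeOptions', 'boxed': true }
```

The discriminated base union is how existing QAPI commands take per-variant arguments, so 'JobType'-specific options should be expressible this way.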

(...)

@@ -1035,10 +1036,31 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
         if (s->in_flight == 0 && cnt == 0) {
             trace_mirror_before_flush(s);
             if (!job_is_ready(&s->common.job)) {
+                if (s->copy_mode ==
+                    MIRROR_COPY_MODE_WRITE_BLOCKING_AFTER_READY) {
+                    /*
+                     * Pause guest I/O to check if we can switch to active mode.
+                     * To set actively_synced to true below, it is necessary to
+                     * have seen and synced all guest I/O.
+                     */
+                    s->in_drain = true;
+                    bdrv_drained_begin(bs);
+                    if (bdrv_get_dirty_count(s->dirty_bitmap) > 0) {
+                        bdrv_drained_end(bs);
+                        s->in_drain = false;
+                        continue;
+                    }
+                    s->in_active_mode = true;
+                    bdrv_disable_dirty_bitmap(s->dirty_bitmap);
+                    bdrv_drained_end(bs);
+                    s->in_drain = false;
+                }
I doubt we really need the drained section.
Frankly speaking, I do not like this. I would rather not
rely on enabling/disabling the whole bitmap, but encode the
mode into MirrorOp and mark blocks dirty only for those
operations for which in_active_mode was set at the
operation start.

Do you mean "for which in_active_mode was *not* set at the operation
start"? Also, the dirty bitmap gets set by things like the
bdrv_co_pwritev() call in bdrv_mirror_top_do_write(), so how would we
prevent setting it? The dirty bitmap does get reset in
do_sync_target_write(), so maybe that already takes care of it, but I
didn't check in detail.

I thought about something like this

iris ~/src/qemu $ git diff
diff --git a/block/mirror.c b/block/mirror.c
index 634815d78d..9cf5f884ee 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -230,7 +230,9 @@ static void coroutine_fn mirror_write_complete(MirrorOp *op, int ret)
     if (ret < 0) {
         BlockErrorAction action;

-        bdrv_set_dirty_bitmap(s->dirty_bitmap, op->offset, op->bytes);
+        if (op->need_dirty) {
+            bdrv_set_dirty_bitmap(s->dirty_bitmap, op->offset, op->bytes);
+        }
         action = mirror_error_action(s, false, -ret);
         if (action == BLOCK_ERROR_ACTION_REPORT && s->ret >= 0) {
             s->ret = ret;
@@ -437,6 +439,7 @@ static unsigned mirror_perform(MirrorBlockJob *s, int64_t offset,
         .offset         = offset,
         .bytes          = bytes,
         .bytes_handled  = &bytes_handled,
+        .need_dirty     = s->copy_mode != MIRROR_COPY_MODE_WRITE_BLOCKING,
     };
     qemu_co_queue_init(&op->waiting_requests);

iris ~/src/qemu $

Den


