qemu-devel

Re: [PATCH v2 5/6] tests/qtest: massively speed up migration-test


From: Juan Quintela
Subject: Re: [PATCH v2 5/6] tests/qtest: massively speed up migration-test
Date: Sat, 22 Apr 2023 00:15:34 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Daniel P. Berrangé <berrange@redhat.com> wrote:
> The migration test cases that actually exercise live migration want to
> ensure there is a minimum of two iterations of pre-copy, in order to
> exercise the dirty tracking code.
>
> Historically we've queried the migration status, looking for the
> 'dirty-sync-count' value to increment to track iterations. This was
> not entirely reliable because often all the data would get transferred
> quickly enough that the migration would finish before we wanted it
> to. So we massively dropped the bandwidth and max downtime to
> guarantee non-convergence. This had the unfortunate side effect
> that every migration took at least 30 seconds to run (100 MB of
> dirty pages / 3 MB/sec).
>
> This optimization takes a different approach to ensuring a
> minimum of two iterations. Rather than waiting for dirty-sync-count
> to increment, directly look for an indication that the source VM
> has dirtied RAM that has already been transferred.
>
> On the source VM a magic marker is written just after the 3 MB
> offset. The destination VM is now monitored to detect when the
> magic marker is transferred. This gives a guarantee that the
> first 3 MB of memory have been transferred. Now the source VM
> memory is monitored at exactly the 3 MB offset until we observe
> a flip in its value. This gives us a guarantee that the guest
> workload has dirtied a byte that has already been transferred.
>
> Since we're looking at a place that is only 3 MB from the start
> of memory, with the 3 MB/sec bandwidth, this test should complete
> in 1 second, instead of 30 seconds.
>
> Once we've proved there is some dirty memory, migration can be
> set back to full speed for the remainder of the 1st iteration,
> and the entirety of the second iteration, at which point migration
> should be complete.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

Hi

I think this is not enough.  As said before:
- xbzrle needs 3 iterations
- auto converge needs around 12 iterations (I forget the exact
  number, but it is a lot).
- for (almost) all the rest of the tests, we don't really care, we just
  need the migration to finish.

One easy way to "test" it is: change the "meaning" of ZERO downtime to
mean that we don't want to enter the completion stage, just continue
sending data.

Changing this in qemu:

modified   migration/migration.c
@@ -2726,6 +2726,9 @@ static MigIterateState migration_iteration_run(MigrationState *s)
 
     trace_migrate_pending_estimate(pending_size, must_precopy, can_postcopy);
 
+    if (s->threshold_size == 0) {
+        return MIG_ITERATE_RESUME;
+    }
     if (must_precopy <= s->threshold_size) {
         qemu_savevm_state_pending_exact(&must_precopy, &can_postcopy);
         pending_size = must_precopy + can_postcopy;

And just setting the downtime to zero should be enough.
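With that convention, the test side would presumably just issue the existing migrate-set-parameters QMP command with a zero downtime limit; something like this (illustrative fragment, assuming downtime-limit in milliseconds):

```
{ "execute": "migrate-set-parameters",
  "arguments": { "downtime-limit": 0 } }
```

Tests that do care about iteration counts would then restore a real downtime limit once they have seen enough iterations.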

It is too late here, so before I start on this: what do you think?

Later, Juan.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]