Re: [PATCH v2 5/6] tests/qtest: massively speed up migration-test
From: Juan Quintela
Subject: Re: [PATCH v2 5/6] tests/qtest: massively speed up migration-test
Date: Sat, 22 Apr 2023 00:15:34 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
Daniel P. Berrangé <berrange@redhat.com> wrote:
> The migration test cases that actually exercise live migration want to
> ensure there is a minimum of two iterations of pre-copy, in order to
> exercise the dirty tracking code.
>
> Historically we've queried the migration status, looking for the
> 'dirty-sync-count' value to increment to track iterations. This was
> not entirely reliable because often all the data would get transferred
> quickly enough that the migration would finish before we wanted it
> to. So we massively dropped the bandwidth and max downtime to
> guarantee non-convergence. This had the unfortunate side effect
> that every migration took at least 30 seconds to run (100 MB of
> dirty pages / 3 MB/sec).
>
> This optimization takes a different approach to ensuring a
> minimum of two iterations. Rather than waiting for dirty-sync-count
> to increment, directly look for an indication that the source VM
> has dirtied RAM that has already been transferred.
>
> On the source VM a magic marker is written just after the 3 MB
> offset. The destination VM is now monitored to detect when the
> magic marker is transferred. This gives a guarantee that the
> first 3 MB of memory have been transferred. Now the source VM
> memory is monitored at exactly the 3 MB offset until we observe
> a flip in its value. This gives us a guarantee that the guest
> workload has dirtied a byte that has already been transferred.
>
> Since we're looking at a place that is only 3 MB from the start
> of memory, with the 3 MB/sec bandwidth, this test should complete
> in 1 second, instead of 30 seconds.
>
> Once we've proved there is some dirty memory, migration can be
> set back to full speed for the remainder of the 1st iteration,
> and the entirety of the second iteration, at which point migration
> should be complete.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
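[For reference, a rough sketch of the wait loop described in the quoted
commit message, written against libqtest primitives. It is only an
illustration of the idea, not the actual patch: start_address, the
marker location and MAGIC_MARKER are assumptions made for the sketch.]

/* Sketch: wait until the marker written just past the 3 MiB offset
 * shows up on the destination (so the first 3 MiB have been sent),
 * then wait for the source byte at the 3 MiB offset to change (so
 * already-transferred memory has been re-dirtied). */
#include <unistd.h>
#include "libqtest.h"

#define THREE_MB      (3 * 1024 * 1024)
#define MAGIC_MARKER  0x42            /* assumed marker value */

static void wait_for_dirty_mem(QTestState *from, QTestState *to,
                               uint64_t start_address)
{
    uint64_t marker_addr = start_address + THREE_MB + 4096; /* just past 3 MiB */
    uint64_t watch_addr  = start_address + THREE_MB;
    uint8_t watch_val;

    /* wait for the magic marker to arrive on the destination */
    while (qtest_readb(to, marker_addr) != MAGIC_MARKER) {
        usleep(1000);
    }

    /* now wait for the source to dirty the byte we know was already sent */
    watch_val = qtest_readb(from, watch_addr);
    while (qtest_readb(from, watch_addr) == watch_val) {
        usleep(1000);
    }
}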
Hi
I think this is not enough. As said before:
- xbzrle needs 3 iterations
- auto converge needs around 12 iterations (I forgot the exact number,
  but it is a lot).
- for (almost) all the rest of the tests, we don't really care, we just
need the migration to finish.
One easy way to "test" it is to change the "meaning" of a ZERO downtime
so that it means we don't want to enter the completion stage, and just
continue sending data.
Changing this in qemu:
modified   migration/migration.c
@@ -2726,6 +2726,9 @@ static MigIterateState migration_iteration_run(MigrationState *s)
     trace_migrate_pending_estimate(pending_size, must_precopy, can_postcopy);

+    if (s->threshold_size == 0) {
+        return MIG_ITERATE_RESUME;
+    }
     if (must_precopy <= s->threshold_size) {
         qemu_savevm_state_pending_exact(&must_precopy, &can_postcopy);
         pending_size = must_precopy + can_postcopy;
And just setting the downtime to zero should be enough.
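[Under those proposed semantics, a test that wants to keep the source
iterating would just set downtime-limit to 0 over QMP and restore a real
value once enough iterations have been observed. A minimal sketch of
that call, assuming the proposed semantics above; the helper name is
illustrative, not an existing function:]

#include "libqtest.h"
#include "qapi/qmp/qdict.h"

/* With downtime-limit == 0 meaning "never enter the completion stage",
 * this keeps the source VM iterating through pre-copy rounds. */
static void keep_migration_iterating(QTestState *from)
{
    QDict *rsp;

    rsp = qtest_qmp(from,
                    "{ 'execute': 'migrate-set-parameters',"
                    "  'arguments': { 'downtime-limit': 0 } }");
    qobject_unref(rsp);
}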
It is late here, so before I start on this: what do you think?
Later, Juan.