From: Daniel P. Berrangé
Subject: Re: [PATCH v2 4/6] tests/qtest: make more migration pre-copy scenarios run non-live
Date: Fri, 26 May 2023 18:58:45 +0100
User-agent: Mutt/2.2.9 (2022-11-12)
On Mon, Apr 24, 2023 at 06:01:36PM -0300, Fabiano Rosas wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
>
> > There are 27 pre-copy live migration scenarios being tested. In all of
> > these we force non-convergence and run for one iteration, then let it
> > converge and wait for completion during the second (or following)
> > iterations. At 3 mbps bandwidth limit the first iteration takes a very
> > long time (~30 seconds).
> >
> > While it is important to test the migration passes and convergence
> > logic, it is overkill to do this for all 27 pre-copy scenarios. The
> > TLS migration scenarios in particular are merely exercising different
> > code paths during connection establishment.
> >
> > To optimize time taken, switch most of the test scenarios to run
> > non-live (ie guest CPUs paused) with no bandwidth limits. This gives
> > a massive speed up for most of the test scenarios.
> >
> > For test coverage the following scenarios are unchanged
> >
> > * Precopy with UNIX sockets
> > * Precopy with UNIX sockets and dirty ring tracking
> > * Precopy with XBZRLE
> > * Precopy with multifd
> >
> > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> > ---
> > tests/qtest/migration-test.c | 60 ++++++++++++++++++++++++++++++------
> > 1 file changed, 50 insertions(+), 10 deletions(-)
> >
> > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> > index 6492ffa7fe..40d0f75480 100644
> > --- a/tests/qtest/migration-test.c
> > +++ b/tests/qtest/migration-test.c
> > @@ -568,6 +568,9 @@ typedef struct {
> > MIG_TEST_FAIL_DEST_QUIT_ERR,
> > } result;
> >
> > + /* Whether the guest CPUs should be running during migration */
> > + bool live;
> > +
> > /* Postcopy specific fields */
> > void *postcopy_data;
> > bool postcopy_preempt;
> > @@ -1324,8 +1327,6 @@ static void test_precopy_common(MigrateCommon *args)
> > return;
> > }
> >
> > - migrate_ensure_non_converge(from);
> > -
> > if (args->start_hook) {
> > data_hook = args->start_hook(from, to);
> > }
> > @@ -1335,6 +1336,31 @@ static void test_precopy_common(MigrateCommon *args)
> > wait_for_serial("src_serial");
> > }
> >
> > + if (args->live) {
> > + /*
> > + * Testing live migration, we want to ensure that some
> > + * memory is re-dirtied after being transferred, so that
> > + * we exercise logic for dirty page handling. We achieve
> > + * this with a ridiculously low bandwidth that guarantees
> > + * non-convergence.
> > + */
> > + migrate_ensure_non_converge(from);
> > + } else {
> > + /*
> > + * Testing non-live migration, we allow it to run at
> > + * full speed to ensure short test case duration.
> > + * For tests expected to fail, we don't need to
> > + * change anything.
> > + */
> > + if (args->result == MIG_TEST_SUCCEED) {
> > + qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
> > + if (!got_stop) {
> > + qtest_qmp_eventwait(from, "STOP");
> > + }
> > + migrate_ensure_converge(from);
> > + }
> > + }
> > +
> > if (!args->connect_uri) {
> > g_autofree char *local_connect_uri =
> > migrate_get_socket_address(to, "socket-address");
> > @@ -1352,19 +1378,29 @@ static void test_precopy_common(MigrateCommon *args)
> > qtest_set_expected_status(to, EXIT_FAILURE);
> > }
> > } else {
> > - wait_for_migration_pass(from);
> > + if (args->live) {
> > + wait_for_migration_pass(from);
> >
> > - migrate_ensure_converge(from);
> > + migrate_ensure_converge(from);
> >
> > - /* We do this first, as it has a timeout to stop us
> > - * hanging forever if migration didn't converge */
> > - wait_for_migration_complete(from);
> > + /*
> > + * We do this first, as it has a timeout to stop us
> > + * hanging forever if migration didn't converge
> > + */
> > + wait_for_migration_complete(from);
> > +
> > + if (!got_stop) {
> > + qtest_qmp_eventwait(from, "STOP");
> > + }
> > + } else {
> > + wait_for_migration_complete(from);
> >
> > - if (!got_stop) {
> > - qtest_qmp_eventwait(from, "STOP");
> > + qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");
>
> I retested and the problem still persists. The issue is with this wait +
> cont sequence:
>
> wait_for_migration_complete(from);
> qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");
>
> We wait for the source to finish but by the time qmp_cont executes, the
> dst is still INMIGRATE, autostart gets set and I never see the RESUME
> event.
This is ultimately caused by the broken logic in the previous
patch 3 that looked for RESUME. Looking for the STOP event would
discard all non-STOP events, including the RESUME event we were
just about to look for. I've had to completely change the event
handling in migration-helpers and libqtest to fix this.
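To illustrate the failure mode, here is a minimal standalone sketch. It
does not use the real libqtest or migration-helpers code: EventStream,
EventState, wait_discarding() and wait_tracking() are all hypothetical
names, and the event stream is just an array of strings. It only shows
why a wait that throws away non-matching events loses a RESUME that
arrives before STOP, and how recording events as they are consumed
avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a QMP event stream; not the libqtest API. */
typedef struct {
    const char **events;
    size_t count;
    size_t pos;
} EventStream;

/* Hypothetical record of which events have been observed so far. */
typedef struct {
    bool got_stop;
    bool got_resume;
} EventState;

/*
 * Lossy wait: consume events until 'name' shows up. Every event
 * that does not match is dropped and can never be seen again.
 */
static bool wait_discarding(EventStream *s, const char *name)
{
    assert(s != NULL && name != NULL);
    while (s->pos < s->count) {
        if (strcmp(s->events[s->pos++], name) == 0) {
            return true;
        }
    }
    return false; /* the real code would block here instead */
}

/* Report whether 'name' was already recorded in the state. */
static bool state_has(const EventState *st, const char *name)
{
    if (strcmp(name, "STOP") == 0) {
        return st->got_stop;
    }
    if (strcmp(name, "RESUME") == 0) {
        return st->got_resume;
    }
    return false;
}

/*
 * Tracking wait: record STOP/RESUME as they are consumed, so a
 * later wait for an already-seen event succeeds immediately.
 */
static bool wait_tracking(EventStream *s, EventState *st, const char *name)
{
    assert(s != NULL && st != NULL && name != NULL);
    if (state_has(st, name)) {
        return true;
    }
    while (s->pos < s->count) {
        const char *ev = s->events[s->pos++];
        if (strcmp(ev, "STOP") == 0) {
            st->got_stop = true;
        } else if (strcmp(ev, "RESUME") == 0) {
            st->got_resume = true;
        }
        if (strcmp(ev, name) == 0) {
            return true;
        }
    }
    return false;
}
```

With a stream of { "RESUME", "STOP" }, waiting for STOP and then for
RESUME fails with wait_discarding() but succeeds with wait_tracking().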
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|