[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is
From: |
Alex Williamson |
Subject: |
Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported |
Date: |
Tue, 17 May 2022 10:00:45 -0600 |
On Mon, 16 May 2022 20:08:32 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:
> On Mon, May 16, 2022 at 02:22:00PM -0600, Alex Williamson wrote:
> > On Mon, 16 May 2022 13:22:14 +0200
> > Juan Quintela <quintela@redhat.com> wrote:
> >
> > > Avihai Horon <avihaih@nvidia.com> wrote:
> > > > Currently, if IOMMU of a VFIO container doesn't support dirty page
> > > > tracking, migration is blocked completely. This is because a DMA-able
> > > > VFIO device can dirty RAM pages without updating QEMU about it, thus
> > > > breaking the migration.
> > > >
> > > > However, this doesn't mean that migration can't be done at all. If
> > > > migration pre-copy phase is skipped, the VFIO device doesn't have a
> > > > chance to dirty RAM pages that have been migrated already, thus
> > > > eliminating the problem previously mentioned.
> > > >
> > > > Hence, in such case allow migration but skip pre-copy phase.
> > > >
> > > > Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> > >
> > > I don't know (TM).
> > > Several issues:
> > > - Patch is ugly as hell (ok, that depends on taste)
> > > - It changes migration_iteration_run() instead of directly
> > > migration_thread.
> > > - There is already another case where we skip the sending of RAM
> > > (localhost migration with shared memory)
> > >
> > > In migration/ram.c:
> > >
> > > static int ram_find_and_save_block(RAMState *rs, bool last_stage)
> > > {
> > > PageSearchStatus pss;
> > > int pages = 0;
> > > bool again, found;
> > >
> > > /* No dirty page as there is zero RAM */
> > > if (!ram_bytes_total()) {
> > > return pages;
> > > }
> > >
> > > This is the other place where we _don't_ send any RAM at all.
> > >
> > > I don't have a great idea about how to make things clear at a higher
> > > level, I have to think about this.
> >
> > It seems like if we have devices dictating what type of migrations can
> > be performed then there probably needs to be a switch to restrict use of
> > such devices just as we have the -only-migratable switch now to prevent
> > attaching devices that don't support migration. I'd guess that we need
> > the switch to opt-in to allowing such devices to maintain
> > compatibility. There's probably a whole pile of qapi things missing to
> > expose this to management tools as well. Thanks,
>
> This is really intended to be a NOP from where things are now, as if
> you use mlx5 live migration without a patch like this then it causes a
> botched pre-copy since everything just ends up permanently dirty.
>
> If it makes more sense we could abort the pre-copy too - in the end
> there will be dirty tracking so I don't know if I'd invest in a big
> adventure to fully define non-dirty tracking migration.
How is pre-copy currently "botched" without a patch like this? If it's
simply that the pre-copy doesn't converge and the downtime constraints
don't allow the VM to enter stop-and-copy, that's the expected behavior
AIUI, and supports backwards compatibility with existing SLAs.
I'm assuming that by setting this new skip_precopy flag that we're
forcing the VM to move to stop-and-copy, regardless of any other SLA
constraints placed on the migration. It seems like a better solution
would be to expose to management tools that the VM contains a device
that does not support the pre-copy phase so that downtime expectations
can be adjusted. Thanks,
Alex
- [PATCH 2/9] vfio: Fix compilation errors caused by VFIO migration v1 deprecation, (continued)
[PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported, Avihai Horon, 2022/05/12
- Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported, Juan Quintela, 2022/05/16
- Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported, Alex Williamson, 2022/05/16
- Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported, Jason Gunthorpe, 2022/05/16
- Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported,
Alex Williamson <=
- Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported, Jason Gunthorpe, 2022/05/17
- Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported, Alex Williamson, 2022/05/17
- Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported, Jason Gunthorpe, 2022/05/17
- Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported, Alex Williamson, 2022/05/17
- Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported, Jason Gunthorpe, 2022/05/18
Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported, Juan Quintela, 2022/05/18
Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported, Jason Gunthorpe, 2022/05/18
Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported, Avihai Horon, 2022/05/24
Re: [PATCH 4/9] vfio/migration: Skip pre-copy if dirty page tracking is not supported, Joao Martins, 2022/05/20