Re: -only-migrate and the two different uses of migration blockers
From: David Gibson
Subject: Re: -only-migrate and the two different uses of migration blockers
Date: Sun, 25 Jul 2021 16:25:53 +1000
On Thu, Jul 22, 2021 at 07:00:56PM +0100, Dr. David Alan Gilbert wrote:
> * David Gibson (david@gibson.dropbear.id.au) wrote:
> > On Tue, Jul 20, 2021 at 07:30:16AM +0200, Markus Armbruster wrote:
> > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> > >
> > > > * Markus Armbruster (armbru@redhat.com) wrote:
> > > >> We appear to use migration blockers in two ways:
> > > >>
> > > >> (1) Prevent migration for an indefinite time, typically due to use of
> > > >> some feature that isn't compatible with migration.
> > > >>
> > > >> (2) Delay migration for a short time.
> > > >>
> > > >> Option -only-migrate is designed for (1). It interferes with (2).
> > > >>
> > > > >> Example for (1): device "x-pci-proxy-dev" doesn't support migration.
> > > > >> It adds a migration blocker on realize, and deletes it on unrealize.
> > > > >> With -only-migrate, device realize fails. Works as designed.
> > > >>
> > > >> Example for (2): spapr_mce_req_event() makes an effort to prevent
> > > > >> migration from degrading the reporting of FWNMIs. It adds a migration blocker
> > > >> when it receives one, and deletes it when it's done handling it. This
> > > >> is a best effort; if migration is already in progress by the time FWNMI
> > > >> is received, we simply carry on, and that's okay. However, option
> > > >> -only-migrate sabotages the best effort entirely.
> > > >
> > > > That's interesting; it's the first time I've heard of anyone using it as
> > > > 'best effort'. I've always regarded blockers as blocking.
> > >
> > > Me too, until I found this one.
> >
> > Right, it may well have been the first usage this way, this fwnmi
> > stuff isn't super old.
> >
> > > >> While this isn't exactly terrible, it may be a weakness in our thinking
> > > >> and our infrastructure. I'm bringing it up so the people in charge are
> > > >> aware :)
> > > >
> > > > Thanks.
> > > >
> > > > It almost feels like they need a way to temporarily hold off
> > > > > 'completion' of migration - i.e. the phase where we stop the CPU and
> > > > write the device data; mind you you'd also probably want it to stop
> > > > cold-migrates/snapshots?
> > >
> > > Yes, a proper way to delay 'completion' for a bit would be clearer, and
> > > wouldn't let -only-migrate interfere.
> >
> > Right. If that becomes a thing, we should use it here. Note that
> > this one use case probably isn't a very strong argument for it,
> > though. The only problem here is slightly less than optimal error
> > reporting in a rare edge case (hardware fault occurs by chance at the
> > same time as a migration).
>
> Can you at least put a scary comment in to say why it's so odd?
>
> If you wanted a choice of a different bad way to do this, since you have
> savevm_htab_handlers, you might be able to make htab_save_iterate claim
> there's always more to do.

That would only work if the hash MMU is in use, which won't be the
case with most current systems.

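To make the two blocker uses from earlier in the thread concrete, here
is a toy model in plain C. It is only a sketch: none of these names are
QEMU's, and the one behaviour it assumes is what the thread describes,
namely that adding a blocker is refused under -only-migrate or when a
migration is already in progress.

```c
#include <stdbool.h>

/* Toy model only: hypothetical names, not QEMU's real API. */
struct toy_migration {
    bool only_migrate;  /* models the option this thread calls -only-migrate */
    bool in_progress;   /* a migration is already running */
    int blockers;       /* number of blockers currently held */
};

/* Returns 0 if the blocker was added, -1 if it was refused. */
int toy_add_blocker(struct toy_migration *m)
{
    if (m->only_migrate || m->in_progress) {
        return -1;
    }
    m->blockers++;
    return 0;
}

void toy_del_blocker(struct toy_migration *m)
{
    if (m->blockers > 0) {
        m->blockers--;
    }
}

/* Use (1): the device cannot migrate, so a refused blocker is fatal. */
int toy_device_realize(struct toy_migration *m)
{
    return toy_add_blocker(m);
}

/*
 * Use (2): best effort. A refused blocker is ignored and the event is
 * handled anyway. Returns true if a blocker was held while handling.
 */
bool toy_handle_event(struct toy_migration *m)
{
    bool held = (toy_add_blocker(m) == 0);

    /* ... the event would be handled here ... */

    if (held) {
        toy_del_blocker(m);
    }
    return held;
}
```

With only_migrate set, toy_device_realize() fails, which is use (1)
working as designed. toy_handle_event() still completes, but it never
holds a blocker, which is the best effort being defeated.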
> > .... and, also, I half-suspect that the whole fwnmi feature exists
> > more to tick IBM RAS check boxes than because anyone will actually use
> > it.
>
> Ah at least it's always reliable....
>
> Dave
>
>
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson