[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-ppc] [QEMU PATCH v5 5/6] migration: spapr: migrate ccs_list in
From: |
David Gibson |
Subject: |
Re: [Qemu-ppc] [QEMU PATCH v5 5/6] migration: spapr: migrate ccs_list in spapr state |
Date: |
Mon, 10 Oct 2016 16:05:14 +1100 |
User-agent: |
Mutt/1.7.0 (2016-08-17) |
On Fri, Oct 07, 2016 at 09:52:51AM -0500, Michael Roth wrote:
> Quoting David Gibson (2016-10-06 22:36:07)
> > On Mon, Oct 03, 2016 at 11:24:56AM -0700, Jianjun Duan wrote:
> > > ccs_list in spapr state maintains the device tree related
> > > information on the rtas side for hotplugged devices. In racing
> > > situations between hotplug events and migration operation, a rtas
> > > hotplug event could be migrated from the source guest to target
> > > guest, or the source guest could have not yet finished fetching
> > > the device tree when migration is started, the target will try
> > > to finish fetching the device tree. By migrating ccs_list, the
> > > target can fetch the device tree properly.
> > >
> > > ccs_list is put in a subsection in the spapr state VMSD to make
> > > sure migration across different versions is not broken.
> > >
> > > Signed-off-by: Jianjun Duan <address@hidden>
> >
> > I'm still not entirely convinced we need to migrate the ccs_list.
> > What would happen if we did this:
> >
> > * Keep a flag which indicates whether the guest is in the middle of
> > the configure_connector process.
> > - I'm not sure if that would need to be a new bit of state, or
> > if we could deduce it from the value of the isolation and
> > allocation states
> > - If it's new state, we'd need to migrate it, obviously not if
> > we can derive it from other state flags
> >
> > * On the destination during post_load, if there was an in-progress
> > configure_connector on the source, we set another "stale
> > configure" flag
> >
> > * When a configure_connector call is attempted on the destination
> > with the stale configure flag set, return an error
> >
> > The question is, if we choose the right error, can we get the guest to
> > either restart the configure from scratch, or fail gracefully, so the
> > operator can restart the hotplug
>
> To get the configure to restart, the guest's configure_connector
> implementation would need to changed. Current code in drmgr would just
> bail on any error, and I'd imagine the in-kernel version does the same.
>
> So at least for existing guests, the only option is failing the command
> at the operator's interface, namely device_add. device_add is
> asynchronous to the actual hotplug event handling however. So if we want
> to convey failure to the user, it would have to be either in the form of
> a new QMP event emitted to convey hotplug success or error, or through
> device_add itself by implementing something like the async QMP rework
> that Marc-Andre posted some time back (which still seems to be a topic
> of debate). Which either approach, something like libvirt, with adequate
> state-tracking for pending hotplug events, could handle an error event
> on the target side post-migrate and convey that to the user somehow.
>
> That's probably a much larger discussion if it comes to that, but doable
> in theory.
>
> But even that wouldn't get us totally out of the woods: DRC state can
> still be modified outside of hotplug. For instance, a guest should be
> able to do:
>
> drmgr -c pci -s <drc_index> -r
> drmgr -c pci -s <drc_index> -a
>
> to return a device to firmware and then later take it back and
> reconfigure it. I'm not aware of any common case where this would occur,
> but it's not disallowed by the specification, and performing a migration
> between these 2 operations would currently break this since the default
> coldplug state on target assumes a configured state in source.
Thanks, that's a good case for why we need this. Can you please fold
this description into your commit messages so it's there for posterity.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature
- Re: [Qemu-ppc] [Qemu-devel] [QEMU PATCH v5 3/6] migration: extend VMStateInfo, (continued)
- Re: [Qemu-ppc] [Qemu-devel] [QEMU PATCH v5 3/6] migration: extend VMStateInfo, Jianjun Duan, 2016/10/12
- Re: [Qemu-ppc] [Qemu-devel] [QEMU PATCH v5 3/6] migration: extend VMStateInfo, Paolo Bonzini, 2016/10/13
- Re: [Qemu-ppc] [Qemu-devel] [QEMU PATCH v5 3/6] migration: extend VMStateInfo, Halil Pasic, 2016/10/13
- Re: [Qemu-ppc] [Qemu-devel] [QEMU PATCH v5 3/6] migration: extend VMStateInfo, Paolo Bonzini, 2016/10/13
- Re: [Qemu-ppc] [Qemu-devel] [QEMU PATCH v5 3/6] migration: extend VMStateInfo, Jianjun Duan, 2016/10/13
- Re: [Qemu-ppc] [Qemu-devel] [QEMU PATCH v5 3/6] migration: extend VMStateInfo, Halil Pasic, 2016/10/13
- Re: [Qemu-ppc] [Qemu-devel] [QEMU PATCH v5 3/6] migration: extend VMStateInfo, Jianjun Duan, 2016/10/13
[Qemu-ppc] [QEMU PATCH v5 5/6] migration: spapr: migrate ccs_list in spapr state, Jianjun Duan, 2016/10/03
[Qemu-ppc] [QEMU PATCH v5 6/6] migration: spapr: migrate pending_events of spapr state, Jianjun Duan, 2016/10/03
Re: [Qemu-ppc] [Qemu-devel] [QEMU PATCH v5 0/6] migration: ensure hotplug and migration work together, no-reply, 2016/10/03
Re: [Qemu-ppc] [Qemu-devel] [QEMU PATCH v5 0/6] migration: ensure hotplug and migration work together, no-reply, 2016/10/03
Re: [Qemu-ppc] [QEMU PATCH v5 0/6] migration: ensure hotplug and migration work together, Jianjun Duan, 2016/10/03