Re: [Qemu-ppc] [PATCH for-2.8 0/3] spapr: fix breakage of memory unplug
From: David Gibson
Subject: Re: [Qemu-ppc] [PATCH for-2.8 0/3] spapr: fix breakage of memory unplug after migration
Date: Mon, 21 Nov 2016 10:58:56 +1100
User-agent: Mutt/1.7.1 (2016-10-04)
On Fri, Nov 18, 2016 at 10:39:49AM -0600, Michael Roth wrote:
> Quoting David Gibson (2016-11-17 23:45:05)
> > On Thu, Nov 17, 2016 at 07:40:24PM -0600, Michael Roth wrote:
> > > These patches are based on David's ppc-for-2.8 tree, and are also
> > > available from:
> > >
> > > https://github.com/mdroth/qemu/commits/spapr-cas-migration
> > >
> > > Currently, memory hotplugged to a pseries guest cannot be removed after
> > > the guest has been migrated. This is due to 2 issues:
> > >
> > > 1) The coldplugged state of memory on the target side is one where the
> > > corresponding DRC's allocation state is:
> > >
> > > allocation_state == unallocated,
> > > awaiting_allocation == true,
> > >
> > > When the guest attempts to unplug memory on the target side, it first
> > > checks that allocation_state == allocated. If we fix this, the guest
> > > can successfully notify QEMU of completion on its end, but then the
> > > DRC code sees that awaiting_allocation == true, so it defers the
> > > finalizing of the LMB and corresponding DIMM since it assumes that
> > > the DIMM must have been previously allocated before it can be removed.
> > >
> > > To address this, we pull in patches 1-2 from Jian Jun's DRC migration
> > > series:
> > >
> > > https://lists.gnu.org/archive/html/qemu-ppc/2016-10/msg00048.html
> > >
> > > with some minor changes relating to prior review comments, and
> > > the addition of migrating the DRC's awaiting_allocation value, which
> > > wasn't part of the original patch. This doesn't address the full scope
> > > of the issues Jian Jun was looking at (involving synchronizing state
> > > when migration occurs during fairly small race windows), just this
> > > particular case, which is more user visible since the time window is
> > > indefinite.
> > >
> > > 2) The ability to unplug memory is gated on the QEMU side by a check as
> > > to whether or not support for newer-style hotplug events was negotiated
> > > via CAS during boot. The check is performed by checking the corresponding
> > > entry in the sPAPROptionVector structure. However, since this value isn't
> > > migrated currently, we are unable to unplug until after the guest reboots.
> > >
> > > We address that here by adding migration support for sPAPROptionVectors,
> > > and including the CAS-negotiated vector as part of the migration stream
> > > and including the CAS-negotiated vector as part of the migration stream
> > > for any cases where we advertise newer-style hotplug event support to
> > > the guest.
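
The two issues above can be pictured as three gates that all have to pass before a memory unplug completes. What follows is only a toy model in Python, not QEMU code: the class `Drc`, the helper functions, and the boolean `hotplug_events_negotiated` are hypothetical names made up for illustration, and the source-side state after a completed hotplug (allocated, not awaiting allocation) is an assumption rather than something stated in the cover letter.

```python
from dataclasses import dataclass


# Toy model only. None of these names come from QEMU's sPAPR DRC code.
@dataclass
class Drc:
    allocated: bool            # stands in for allocation_state == allocated
    awaiting_allocation: bool  # stands in for the awaiting_allocation flag


def guest_can_start_unplug(drc: Drc) -> bool:
    """Issue 1, guest side: unplug only starts if the DRC looks allocated."""
    return drc.allocated


def qemu_finalizes_unplug(drc: Drc) -> bool:
    """Issue 1, QEMU side: finalizing is deferred while awaiting_allocation is set."""
    return not drc.awaiting_allocation


def unplug_succeeds(drc: Drc, hotplug_events_negotiated: bool) -> bool:
    """Issue 2 gates the whole path on the CAS-negotiated option."""
    return (hotplug_events_negotiated
            and guest_can_start_unplug(drc)
            and qemu_finalizes_unplug(drc))


# Source side after a hotplug the guest has accepted (assumed state).
source = Drc(allocated=True, awaiting_allocation=False)
source_negotiated = True

# Target side without the fixes: the DRC is in its coldplugged state and
# the CAS-negotiated vector was not part of the migration stream.
target_unfixed = Drc(allocated=False, awaiting_allocation=True)
target_unfixed_negotiated = False

# Target side with both pieces of state carried across migration.
target_fixed = Drc(source.allocated, source.awaiting_allocation)
target_fixed_negotiated = source_negotiated

print(unplug_succeeds(source, source_negotiated))
print(unplug_succeeds(target_unfixed, target_unfixed_negotiated))
print(unplug_succeeds(target_fixed, target_fixed_negotiated))
```

In this model, migrating only the DRC state or only the negotiated option still leaves one gate closed, which is one way to read why the series addresses both.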
> > >
> > > David,
> > >
> > > These fixes ended up going out much later than planned. I'm not sure
> > > if you're planning another pull for 2.8 or not, and realize there are
> > > some patches here not specifically pseries-related so it's
> > > understandable if we opt to pursue these for 2.9/2.8.1 instead. But if
> > > possible I'm hoping to get these in so that the memory unplug
> > > support is fully functional for 2.8.
> >
> > Yeah, I'm still expecting to push a few bugfixes in before 2.8. So,
> > I've merged these patches into ppc-for-2.8 (fixing a couple of trivial
> > style nits along the way). I have a couple of comments that I'll make
> > on the patches, but they're not important enough to stop these going
> > in ASAP.
> >
> > Unfortunately, of course, this is not the only migration breakage we
> > have at the moment. I'm presently wrestling with both breakage due to
> > changes in the insns_flags masks, and due to the reworking of the mmio
> > windows for the PHB.
>
> Ok, thanks for the heads up. FYI I'm still hoping to get the insns_flags
> fix in for 2.7.1 (which is a bit behind at this point, should have schedule
> and initial tree posted next week though), so I will keep an eye out for
> those.
Yeah, in addition to being sick, I've had to rethink how to fix these
migration problems, including how to address this for 2.7.1. I'm
working on it right now.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson