qemu-devel

Re: [PATCH 0/6] RfC: try improve native hotplug for pcie root ports


From: Daniel P . Berrangé
Subject: Re: [PATCH 0/6] RfC: try improve native hotplug for pcie root ports
Date: Thu, 11 Nov 2021 18:08:11 +0000
User-agent: Mutt/2.0.7 (2021-05-04)

On Thu, Nov 11, 2021 at 12:11:19PM -0500, Michael S. Tsirkin wrote:
> On Thu, Nov 11, 2021 at 09:35:37AM +0000, Daniel P. Berrangé wrote:
> > On Thu, Nov 11, 2021 at 03:20:07AM -0500, Michael S. Tsirkin wrote:
> > > On Thu, Nov 11, 2021 at 08:53:06AM +0100, Gerd Hoffmann wrote:
> > > >   Hi,
> > > > 
> > > > > Given it's a bugfix, and given that I hear through internal channels
> > > > > that QE results so far have been encouraging, I am inclined to bite
> > > > > the bullet and merge this for -rc1.
> > > > 
> > > > Fine with me.
> > > > 
> > > > > I don't think this conflicts with Julia's patches as users can still
> > > > > disable ACPI hotplug into bridges.  Gerd, agree?  Worth the risk?
> > > > 
> > > > Combining this with Julia's patches looks a bit risky to me and surely
> > > > needs testing.  I expect the problematic case is both native and acpi
> > > > hotplug being enabled.  When the guest uses acpi hotplug nobody will
> > > > turn on slot power on the pcie root port ...
> > > 
> > > I'm not sure I understand what the situation is, and how to trigger it.
> > > Could you clarify pls?
> > > 
> > > > I'd suggest to just revert to pcie native hotplug for 6.2.
> > > 
> > > Hmm that kind of change seems even riskier to me. I think I'll try with
> > > Igor's patches.
> > 
> > Why would it be risky ? PCIE native hotplug is what we've used in
> > QEMU for years & years, until 6.1 enabled the buggy ACPI hotplug.
> > The behaviour of the current PCIE native hotplug impl is a known
> > quantity.
> > 
> > Regards,
> > Daniel
> 
> For example we might regress some of the bugs that the switch to ACPI fixed
> back to 6.0 state. There were a bunch of these. For example it should be
> possible for guests to disable native and use ACPI then, but isn't.

Of course there were bugs fixed by switching to ACPI, but we'd
lived with native hotplug in production and the majority of
the time it worked as users needed. The bugs were edge cases
essentially only affecting a small subset of users.

The switch to ACPI broke the out of the box configuration
used by OpenStack. That's not an edge case, that's a serious
impact.

> I'm very willing to consider the switch back to native by default
> but given the timing doing big changes like that at the last
> minute seems unusual.

I consider it to be fixing a serious regression by going back
to a known working safe impl, that has been used in production
successfully for a long time. We know there are bugs with
native hotplug, but they're *known* problems.

The unusual thing about timing is having a major functional
regression identified in the previous release and then not
even having patches proposed to fix it until after soft
freeze for the next release arrives :-(

It doesn't give a feeling of confidence, but makes me
wonder what other *unknown* problems we're liable to hit
still.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



