Re: [Qemu-ppc] [Qemu-devel] [FIX PATCH] spapr_drc: Return correct state
From: David Gibson
Subject: Re: [Qemu-ppc] [Qemu-devel] [FIX PATCH] spapr_drc: Return correct state for logical DR in entity_sense()
Date: Wed, 9 Sep 2015 14:13:28 +1000
User-agent: Mutt/1.5.23 (2014-03-12)

On Tue, Sep 08, 2015 at 05:03:25PM -0500, Michael Roth wrote:
> Quoting Michael Roth (2015-09-08 16:03:56)
> > Quoting David Gibson (2015-09-07 20:22:50)
> > > On Mon, Sep 07, 2015 at 11:37:04AM +0530, Bharata B Rao wrote:
> > > > When drmgr is run in the guest to add a device for which device_add
> > > > hasn't been issued in QEMU, configure-connector call fails.
> > > > When configure-connector call fails, the guest would release (*)
> > > > the previously acquired DRC by setting back the DRC isolation state
> > > > to ISOLATED and allocation state to UNUSABLE. These calls will be issued
> > > > only if get-sensor-state call returns PRESENT state. However currently
> > > > for a logical DR, entity_sense() would unconditionally return UNUSABLE
> > > > state only. This prevents any subsequent hotplug of the device with
> > > > that DRC.
> >
> > This seems a little odd. I think we default to ALLOCATION_STATE_UNUSABLE
> > for logical DR, and it's up to the guest to transition to USABLE, which
> > probably happens prior to the configure-connector calls. So I think the
> > net effect of this fix is that guest will see these unallocated/unattached
> > resources the same way they would a resource that was actually attached
> > via device_add, and all we're really doing is working around the
> > eventual configuration failure that that will lead to by pretending a
> > resource was actually there.
> >
> > According to PAPR+ 2.7:
> >
> > 13.7.3.1 Acquire Logical Resource from Resource Pool:
> >
> > If the state is “unusable” the OS issues set-indicator (allocation-state,
> > usable) to attempt to allocate the resource. Similarly, if the state is
> > “available for exchange” the OS issues set-indicator (allocation-state,
> > exchange) to attempt to allocate the resource, and if the state is
> > “available for recovery” the OS issues set-indicator (allocation-state,
> > recover) to attempt to allocate the resource.
> >
> > and
> >
> > 13.7 Logical Resource Dynamic Reconfiguration (LRDR):
> >
> > The OS may use the get-sensor-state RTAS call with the dr-entity-sense
> > token to determine if a given drc-index refers to a connector that is
> > currently usable for DR operations. If the connector is not currently
> > usable the return state is “DR entity unusable” (2). A set-indicator
> > (isolation state) RTAS call to an unusable connector or (dr-indicator) to
> > any logical resource connector results in a “No such indicator
> > implemented” return status.
> >
> > So I think maybe the proper fix is to make sure that
> > drc->set_indicator_state() fails with an error that indicates to RTAS to
> > return NO_SENSOR (-3) for cases where we haven't attached a resource
> > to the DRC via device_add.
>
> Patch incoming:
>
> spapr_drc: don't allow 'empty' DRCs to be unisolated
>
> applies to spapr-next but requires revert of this patch.
>
> Bharata, can you give it a spin with CPU hotplug and see if it fixes the
> issue you hit?
>
> >
> > Which also kind of re-opens the discussion of whether or not
> > drc->set_indicator_state() should return RTAS errors directly. I'd
> > still stray away from that for now but maybe if we get more cases
> > like this it'll start becoming more practical.
>
> I'm starting to second-guess myself on this. I'm trying to maintain
> separation between RTAS/DRC but the result is a bit pathological. Feel
> free to comment in the above patch.

To my mind the set_indicator path is pretty much inherently specific
to PAPR's weird way of doing things. For that reason I think it's
just going to be simplest for it to return PAPR-defined errors,
i.e. RTAS errors.

The whole thing is so tied to PAPR, I don't think it makes sense to
try to use generic error codes for it. Trying to separate "DRC" error
codes from the RTAS codes they result in sounds like a translation
layer for no real gain.
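
To illustrate the shape this could take, here is a minimal standalone sketch, not the actual QEMU code. The struct, enum, and function names are all hypothetical, and the only value taken from this thread is the -3 status mentioned above. The DRC hook hands back an RTAS status itself, so the RTAS handler can pass it to the guest unchanged:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; -3 is the status value quoted in this thread. */
enum {
    SKETCH_RTAS_SUCCESS = 0,
    SKETCH_RTAS_NO_SUCH_INDICATOR = -3,
};

enum sketch_isolation_state {
    SKETCH_ISOLATED = 0,
    SKETCH_UNISOLATED = 1,
};

/* Hypothetical stand-in for a DR connector. */
struct sketch_drc {
    bool has_device; /* true once a resource has been attached */
    enum sketch_isolation_state isolation;
};

/*
 * Returns an RTAS status directly, with no intermediate "DRC" error
 * code that a caller would have to translate.
 */
int32_t sketch_set_isolation_state(struct sketch_drc *drc,
                                   enum sketch_isolation_state state)
{
    assert(drc != NULL);

    /* An 'empty' connector cannot be unisolated. */
    if (state == SKETCH_UNISOLATED && !drc->has_device) {
        return SKETCH_RTAS_NO_SUCH_INDICATOR;
    }

    drc->isolation = state;
    return SKETCH_RTAS_SUCCESS;
}
```

With this shape the set-indicator handler just stores the hook's return value as the call's status, and there is nothing to map.
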
> Just FYI: original author names appear to have gotten lost in recent
> spapr-next rebase.
Ah, thanks for catching that. Should be fixed now.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson