qemu-ppc

Re: [Qemu-ppc] [Qemu-devel] [FIX PATCH] spapr_drc: Return correct state


From: David Gibson
Subject: Re: [Qemu-ppc] [Qemu-devel] [FIX PATCH] spapr_drc: Return correct state for logical DR in entity_sense()
Date: Wed, 9 Sep 2015 14:13:28 +1000
User-agent: Mutt/1.5.23 (2014-03-12)

On Tue, Sep 08, 2015 at 05:03:25PM -0500, Michael Roth wrote:
> Quoting Michael Roth (2015-09-08 16:03:56)
> > Quoting David Gibson (2015-09-07 20:22:50)
> > > On Mon, Sep 07, 2015 at 11:37:04AM +0530, Bharata B Rao wrote:
> > > > When drmgr is run in the guest to add a device for which device_add
> > > > hasn't been issued in QEMU, configure-connector call fails.
> > > > When configure-connector call fails, the guest would release (*)
> > > > the previously acquired DRC by setting back the DRC isolation state
> > > > to ISOLATED and allocation state to UNUSABLE. These calls will be issued
> > > > only if get-sensor-state call returns PRESENT state. However currently for
> > > > a logical DR, entity_sense() would unconditionally return UNUSABLE
> > > > state only. This prevents any subsequent hotplug of the device with
> > > > that DRC.
> > 
> > This seems a little odd. I think we default to ALLOCATION_STATE_UNUSABLE
> > for logical DR, and it's up to the guest to transition to USABLE, which
> > probably happens prior to the configure-connector calls. So I think the
> > net effect of this fix is that guest will see these unallocated/unattached
> > resources the same way they would a resource that was actually attached
> > via device_add, and all we're really doing is working around the
> > eventual configuration failure that that will lead to by pretending a
> > resource was actually there.
> > 
> > According to PAPR+ 2.7:
> > 
> > 13.7.3.1 Acquire Logical Resource from Resource Pool:
> > 
> >   If the state is “unusable” the OS issues set-indicator (allocation-state,
> >   usable) to attempt to allocate the resource. Similarly, if the state is
> >   “available for exchange” the OS issues set-indicator (allocation-state,
> >   exchange) to attempt to allocate the resource, and if the state is
> >   “available for recovery” the OS issues set-indicator (allocation-state,
> >   recover) to attempt to allocate the resource.
> > 
> > and
> > 
> > 13.7 Logical Resource Dynamic Reconfiguration (LRDR):
> > 
> >   The OS may use the get-sensor-state RTAS call with the dr-entity-sense
> >   token to determine if a given drc-index refers to a connector that is
> >   currently usable for DR operations. If the connector is not currently
> >   usable the return state is “DR entity unusable” (2). A set-indicator
> >   (isolation state) RTAS call to an unusable connector or (dr-indicator) to
> >   any logical resource connector results in a “No such indicator
> >   implemented” return status.
> > 
> > So I think maybe the proper fix is to make sure that
> > drc->set_indicator_state() fails with an error that indicates to RTAS to
> > return NO_SENSOR (-3) for cases where we haven't attached a resource
> > to the DRC via device_add.
> 
> Patch incoming:
> 
>   spapr_drc: don't allow 'empty' DRCs to be unisolated
> 
> applies to spapr-next but requires revert of this patch.
> 
> Bharata, can you give it a spin with CPU hotplug and see if it fixes the
> issue you hit?
> 
> > 
> > Which also kind of re-opens the discussion of whether or not
> > drc->set_indicator_state() should return RTAS errors directly. I'd
> > still stay away from that for now but maybe if we get more cases
> > like this it'll start becoming more practical.
> 
> I'm starting to second-guess myself on this. I'm trying to maintain
> separation between RTAS/DRC but the result is a bit pathological. Feel free
> to comment in the above patch.

To my mind the set_indicator path is pretty much inherently specific
to PAPR's weird way of doing things.  For that reason I think it's
just going to be simplest for it to return PAPR defined errors -
i.e. RTAS errors.

The whole thing is so tied to PAPR, I don't think it makes sense to
try to use generic error codes for it.  Trying to separate "DRC" error
codes from the RTAS codes they result in sounds like a translation
layer for no real gain.
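
As a rough sketch of what returning RTAS codes straight from the DRC
could look like: the Demo* names and the struct layout below are made
up for illustration and are not the actual spapr_drc code. The -3
status is the value mentioned earlier in the thread, and "empty" is
assumed to mean a logical DRC with no device attached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* RTAS status codes used in this sketch; -3 is the value quoted in the thread. */
enum {
    RTAS_OUT_SUCCESS = 0,
    RTAS_OUT_NO_SUCH_INDICATOR = -3,
};

/* Hypothetical isolation states (not the real QEMU enum). */
typedef enum {
    DEMO_ISOLATED = 0,
    DEMO_UNISOLATED = 1,
} DemoIsolationState;

/* Hypothetical, heavily reduced stand-in for a DR connector. */
typedef struct {
    bool logical;   /* logical DR connector, e.g. CPU or memory */
    void *dev;      /* non-NULL once a resource was attached (device_add) */
    DemoIsolationState isolation_state;
} DemoDRC;

/*
 * The DRC hook returns an RTAS status directly, so no separate "DRC error"
 * needs translating. Unisolating an empty logical DRC is refused and the
 * connector state is left untouched.
 */
static int demo_set_isolation_state(DemoDRC *drc, DemoIsolationState state)
{
    assert(drc != NULL);

    if (state == DEMO_UNISOLATED && drc->logical && drc->dev == NULL) {
        return RTAS_OUT_NO_SUCH_INDICATOR;
    }

    drc->isolation_state = state;
    return RTAS_OUT_SUCCESS;
}

/* The RTAS-side caller just forwards the status to the guest-visible return slot. */
static void demo_rtas_set_indicator(DemoDRC *drc, DemoIsolationState state,
                                    int *rets)
{
    rets[0] = demo_set_isolation_state(drc, state);
}
```

With this shape the RTAS handler has nothing to map: whatever the DRC
returns is what the guest sees.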

> Just FYI: original author names appear to have gotten lost in recent
> spapr-next rebase.

Ah, thanks for catching that.  Should be fixed now.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
