Re: [Qemu-devel] [PATCH v6 07/15] spapr_rtas: add ibm, configure-connect


From: Michael Roth
Subject: Re: [Qemu-devel] [PATCH v6 07/15] spapr_rtas: add ibm, configure-connector RTAS interface
Date: Mon, 16 Mar 2015 22:31:29 -0500
User-agent: alot/0.3.4

Quoting David Gibson (2015-03-12 00:52:10)
> On Thu, Mar 05, 2015 at 08:12:58AM -0600, Michael Roth wrote:
> > Quoting David Gibson (2015-03-04 22:30:40)
> > > On Wed, Mar 04, 2015 at 07:37:08AM -0600, Michael Roth wrote:
> > > > Quoting Michael Roth (2015-03-03 23:50:34)
> > > > > Quoting David Gibson (2015-03-02 23:33:39)
> > > > > > On Mon, Mar 02, 2015 at 10:40:16PM -0600, Michael Roth wrote:
> > > > > > > Quoting David Gibson (2015-03-02 01:02:46)
> > > > > > > > On Thu, Feb 26, 2015 at 09:11:07PM -0600, Michael Roth wrote:
> > > > > > > > > > This interface is used to fetch an OF device-tree node that
> > > > > > > > > > describes a newly-attached device to the guest. It is called
> > > > > > > > > > multiple times to walk the device-tree node and fetch
> > > > > > > > > > individual properties into a 'workarea'/buffer provided by
> > > > > > > > > > the guest.
> > > > > > > > > > 
> > > > > > > > > > The device-tree is generated by QEMU and passed to an
> > > > > > > > > > sPAPRDRConnector during the initial hotplug operation, and
> > > > > > > > > > the state of these RTAS calls is tracked by the
> > > > > > > > > > sPAPRDRConnector. When the last of these properties is
> > > > > > > > > > successfully fetched, we report a special return value to
> > > > > > > > > > the guest and transition the device to a 'configured' state
> > > > > > > > > > on the QEMU/DRC side.
> > > > > > > > > > 
> > > > > > > > > > See docs/specs/ppc-spapr-hotplug.txt for a complete
> > > > > > > > > > description of this interface.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Michael Roth <address@hidden>
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > So, actually, here's probably the best place to explain what
> > > > > > > > > I had in mind for changing the internal interface for this
> > > > > > > > > stuff.  I was thinking something like this pseudocode:
> > > > > > > > 
> > > > > > > > struct DRCCCState {
> > > > > > > >         void *fdt;
> > > > > > > >         int offset;
> > > > > > > >         int depth;
> > > > > > > > };
> > > > > > > > 
> > > > > > > > rtas_configure_connector()
> > > > > > > > {
> > > > > > > >         ...
> > > > > > > >         DRCCCState *ccstate;
> > > > > > > >         ...
> > > > > > > > 
> > > > > > > >         /* check parameters, retrieve drc */
> > > > > > > >         ccstate = drc->ccstate;
> > > > > > > > 
> > > > > > > >         if (!ccstate) {
> > > > > > > >                 /* Haven't started configuring yet */
> > > > > > > >                 ccstate = malloc(...);
> > > > > > > >                 /* Retrieve the dt fragment from the backend */
> > > > > > > >                 ccstate->fdt = drck->get_dt(...);
> > > > > > > >                 ccstate->offset = 0;
> > > > > > > >         }
> > > > > > > > 
> > > > > > > >         while (get next tag from fdt) {
> > > > > > > >                 switch (tag)
> > > > > > > >                 case FDT_PROPERTY:
> > > > > > > > >                         /* Translate property into rtas return values */
> > > > > > > > >                         return SPAPR_DR_CC_RESPONSE_NEXT_PROPERTY;
> > > > > > > > 
> > > > > > > >                 /* other cases ... */
> > > > > > > >         }
> > > > > > > >         
> > > > > > > > >         /* Fall through only if we've completed streaming out
> > > > > > > > >          * the dt */
> > > > > > > > > 
> > > > > > > > >         /* Tell the back end we've finished configuring */
> > > > > > > >         drck->cc_completed(...);
> > > > > > > >         return SPAPR_DR_CC_RESPONSE_SUCCESS;
> > > > > > > > }
> > > > > > > > 
> > > > > > > > > On reset, or anything else which interrupts the configuration
> > > > > > > > > process, just blow away drc->ccstate.
> > > > > > > 
> > > > > > > Ok, that seems reasonable. I took a stab at it here:
> > > > > > > 
> > > > > > >     
> > > > > > > https://github.com/mdroth/qemu/commit/79ce372743da1b63a6fa33e3de1f1daba8ea1fdc
> > > > > > >     https://github.com/mdroth/qemu/commits/spapr-hotplug-pci
> > > > > > 
> > > > > > It's looking pretty close now, thanks for the rework.
> > > > > > 
> > > > > > > > It exposes the ccstate as you suggested, via
> > > > > > > > drck->get_cc_state(), and in place of drck->cc_completed() I
> > > > > > > > have drck->set_configured() which serves roughly the same
> > > > > > > > purpose I think. I opted not to let RTAS handle allocation,
> > > > > > > > since it seemed to imply RTAS owns it and not the DRC.
> > > > > > 
> > > > > > > So, that was intentional; basically RTAS *does* own the CCstate.
> > > > > > > But for convenience of indexing we need to connect it to the DRC.
> > > > > > > Think of it like an rtas_priv field in the DRC.
> > > > > > 
> > > > > > In particular I think the CCstate should be opaque to everything
> > > > > > except the RTAS code itself, which means initializing the offset and
> > > > > > depth in RTAS, not in a drck callback.  As far as the drck callback
> > > > > > > is concerned, it's supplying a dt fragment, but it doesn't care
> > > > > > > about the details of how the upper layer communicates that
> > > > > > > through to the guest.
> > > > > 
> > > > > Ah ok, so it was about moving the CCState out of DRC, and not just the
> > > > > awkward interface that wraps FDT traversal. So I went ahead and did it
> > > > > as you suggested, but also making it actually opaque, and relying on
> > > > > a couple callbacks that configure-connector passes to
> > > > > drc->begin_configure_connector to handle init/reset of the CCState
> > > > > > fields (such as the fdt, and the start offset (which isn't
> > > > > > necessarily 0)):
> > > > > 
> > > > >   https://github.com/mdroth/qemu/commits/spapr-hotplug-pci
> > > > >   
> > > > > https://github.com/mdroth/qemu/commit/732aa10fa2e41951c396373e7df7d31861322531
> > > > > 
> > > > > I think I have all your other comments addressed, so if that looks ok
> > > > > I'll post v7 soon. Thanks!
> > > > 
> > > > Yikes, just noticed a use-after-free in the new code. Fixed here:
> > > > 
> > > >   
> > > > https://github.com/mdroth/qemu/commit/3fd03f649dc5cd34aa6e2544d38855dd0f8b3708
> > > 
> > > Ok, I'm now getting myself a bit tangled in the various revisions.
> > > However looking at
> > > 
> > > https://github.com/mdroth/qemu/commit/732aa10fa2e41951c396373e7df7d31861322531
> > > 
> > > The ->begin_configure_connector stuff seems unnecessarily
> > > complicated.  Couldn't you just have begin_configure_connector()
> > > return the fdt, then initialize ccs in rtas_ibm_configure_connector()
> > > itself, avoiding the callback-from-a-callback.
> > 
> > We need the fdt, as well as the fdt starting offset, to initialize the CCS.
> 
> Do you actually have a use-case for a non-zero starting offset? Or
> could you simplify by having the individual PCI device always create
> its fdt fragment at offset 0.

Something as simple as:

offset = fdt_add_subnode(fdt, 0, "address@hidden");

Results in offset = 8

I'm not sure exactly why, but I guess a subnode has an inherent offset
associated with it.
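
For what it's worth, the 8 looks consistent with the flattened device tree
layout: a node in the structure block starts with a 4-byte FDT_BEGIN_NODE
token followed by its NUL-terminated name, padded to a 4-byte boundary. The
root node's name is empty, so its header takes 8 bytes. The sketch below only
does that arithmetic; it assumes the root has no properties ahead of the new
subnode, and the helper names are made up for illustration (they are not
libfdt functions).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Size of a structure-block token (FDT_BEGIN_NODE and friends). */
#define DEMO_FDT_TAGSIZE 4u

/* Round up to the 4-byte alignment used inside the structure block. */
uint32_t demo_fdt_align4(uint32_t n)
{
    return (n + 3u) & ~3u;
}

/*
 * Bytes taken by a node header: the FDT_BEGIN_NODE token plus the
 * NUL-terminated node name, padded to 4 bytes. Hypothetical helper.
 */
uint32_t demo_begin_node_size(const char *name)
{
    assert(name != NULL);
    return DEMO_FDT_TAGSIZE + demo_fdt_align4((uint32_t)strlen(name) + 1u);
}
```

With an empty root name, demo_begin_node_size("") is 4 + 4 = 8, which is where
the first subnode of a property-less root would start.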

I've since found that fdt_offset_ptr() can be used to bake the offset into
the fdt pointer, so RTAS can treat the offset as 0 from that point forward.

I've implemented a drc->get_fdt() using this approach.

> 
> > I think it's a matter of taste whether those are returned separately,
> > or through a callback passed via begin_configure_connector. The approach I
> > took just seemed a bit more instructive about what data was needed,
> > and why.
> 
> > drck->get_fdt() and drck->get_fdt_starting_offset() instead of the
> > callback seemed a bit too specific in purpose to warrant a general
> > interface, and since we seem to need a reset_ccs anyway (see below),
> > init_ccs seemed like a good place to contain those values.
> 
> Um.. I'm a bit confused by this.  You could return both the fdt
> pointer and offset as one call using pointers or a structure return
> value without needing to invoke a callback-from-a-callback.

True, a get_fdt() could also take a pointer arg to store the offset, so
that's doable. fdt_offset_ptr() is a bit cleaner though IMO.
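
To make the pointer-arg variant concrete, here is a rough sketch of a getter
that returns the fdt and stores the starting offset through an optional
out-parameter. DemoDRC and demo_get_fdt() are simplified stand-ins invented
for this example, not the actual DRC structure or callback signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the DRC; not QEMU's real struct. */
typedef struct DemoDRC {
    void *fdt;             /* device-tree blob generated at hotplug time */
    int fdt_start_offset;  /* offset of the device's node within the blob */
} DemoDRC;

/*
 * Return the fdt and, if the caller asks for it, the starting offset.
 * One call hands back both values, so no callback-from-a-callback.
 */
void *demo_get_fdt(const DemoDRC *drc, int *fdt_start_offset)
{
    assert(drc != NULL);
    if (fdt_start_offset) {
        *fdt_start_offset = drc->fdt_start_offset;
    }
    return drc->fdt;
}
```

The RTAS side would then initialize its own walk state from the returned
pointer and offset, keeping that state private to RTAS.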

> 
> > I am fine with just initializing ccs via get_fdt()/get_fdt_starting_offset()
> > beforehand though, but I do think we're stuck with a reset_ccs callback
> > if we're agreed on drck->get_configure_connector_state() == NULL being
> > the primary means to invalidate CCS state.
> 
> Hm.  I'll have to take another look.  I'd really like to keep things
> to a single set of callbacks if possible, rather than having both
> callbacks and counter-callbacks, or whatever you want to call them.

I ran it by Alex during his IBM visit, and it's seeming like this is
turning out a bit more funky than necessary because we're trying to
combine my approach of relying on the DRC to store the state as an
opaque, while still keeping the state opaque to anything but RTAS.

If we move the configure-connector state to a separate list as you
originally suggested, the need for weird callbacks goes away.

This does bring back my original concerns about having a way for the
DRC to reset the state on 1) configure-connector reset, and 2) system reset:

1) can be addressed by making the observation that RTAS does know
when to reset the configure-connector state, via rtas-set-indicator(ISOLATE).
So if we do it that way, RTAS can zap/invalidate a CCS at the same points
the DRC would have done it. We leak a little bit of the DRC state-machine,
but it's fairly trivial.

2) can be addressed by registering a separate reset handler that clears the
   CCS list (which I've hung off of sPAPREnvironment, with
   spapr_ccs_{add,remove,reset_hook} to work with the list using drc_index
   as the key)
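
Roughly, the list handling for 1) and 2) could look like the sketch below.
This is a simplified illustration, not the code in the branch: the Demo*
names, the fields, and the plain singly-linked list are assumptions made for
the example. Removal is what RTAS would do when a connector is isolated, and
the reset function is what a system reset handler would call.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-connector configure-connector state, owned by RTAS. */
typedef struct DemoCCS {
    uint32_t drc_index;    /* key: index of the DRC being configured */
    int fdt_offset;        /* current position of the fdt walk */
    int fdt_depth;         /* current node depth of the fdt walk */
    struct DemoCCS *next;
} DemoCCS;

/* Hypothetical stand-in for the list head hung off the machine state. */
typedef struct DemoCCSList {
    DemoCCS *head;
} DemoCCSList;

/* Look up the state for a connector; NULL means "not configuring". */
DemoCCS *demo_ccs_find(DemoCCSList *list, uint32_t drc_index)
{
    assert(list != NULL);
    for (DemoCCS *ccs = list->head; ccs; ccs = ccs->next) {
        if (ccs->drc_index == drc_index) {
            return ccs;
        }
    }
    return NULL;
}

/* Start tracking a connector, or return the state already being tracked. */
DemoCCS *demo_ccs_add(DemoCCSList *list, uint32_t drc_index)
{
    DemoCCS *ccs = demo_ccs_find(list, drc_index);
    if (ccs) {
        return ccs;
    }
    ccs = calloc(1, sizeof(*ccs));
    if (!ccs) {
        return NULL;
    }
    ccs->drc_index = drc_index;
    ccs->next = list->head;
    list->head = ccs;
    return ccs;
}

/* Invalidate one connector's state, e.g. on set-indicator(ISOLATE). */
void demo_ccs_remove(DemoCCSList *list, uint32_t drc_index)
{
    assert(list != NULL);
    for (DemoCCS **pp = &list->head; *pp; pp = &(*pp)->next) {
        if ((*pp)->drc_index == drc_index) {
            DemoCCS *dead = *pp;
            *pp = dead->next;
            free(dead);
            return;
        }
    }
}

/* Drop every entry; intended to be called from a system reset handler. */
void demo_ccs_reset(DemoCCSList *list)
{
    assert(list != NULL);
    while (list->head) {
        DemoCCS *dead = list->head;
        list->head = dead->next;
        free(dead);
    }
}
```

Since the fdt itself stays with the DRC, dropping a list entry only frees the
walk state and never the device tree data.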

I've pushed the changes here:

https://github.com/mdroth/qemu/commits/spapr-hotplug-pci

2a6f2b2 *spapr_drc: make prop_get_fdt() standalone
9140fe4 *spapr_drc: add get_fdt() and set_configured()
d242fed *spapr_rtas: ibm,configure-connector, use get_fdt()/set_configured()
984ee1b *spapr_drc: drop the old stuff

Let me know if you'd prefer I just submit a v8.

> 
> > > I'm also not sure that reset_ccs is worth abstracting.  I think it
> > > would be reasonable just to say that freeing and setting to NULL the
> > > ccs link is sufficient.
> > 
> > But after allocation, rtas_configure_connector hands over the ccs link
> > to DRC, and its local copy goes out of scope. The only way to retrieve
> > it is via get_configure_connector_state(), so if the idea is to return
> > NULL upon reset, we have no way to free the ccs structure. If we simply
> > have DRC free it, we violate the idea that ccs state is opaque. So given
> > the init_ccs callback above, it made sense to handle the free via a
> > reset_ccs.
> > 
> > > 
> > > That said, the current reset_ccs doesn't appear to be quite right,
> > > since it frees the ccs structure, but not the fdt fragment it points
> > > to.  I'm not sure how awkward it would be to force them into a common
> > > allocation to avoid that.
> > 
> > You mean freeing the actual FDT data? In this case the FDT pointer is
> > simply a pointer to the copy the DRC has, and the lifecycle of the FDT
> > is tied to the device lifecycle, and spans beyond that of a CCS (since
> > we can configure/unconfigure the same device multiple times without
> > unplugging in between)
> 
> Oh, ok.  Why do you need a copy in ccstate then?  The rtas code has
> access to the drc structure as well.

Hmm, true, we don't actually need a copy. It makes a little more
sense when using the fdt_offset_ptr() approach to get rid of the offset,
and I think now it makes the separation between DRC and
rtas-configure-connector a bit more complete, but we could still just
call drc->get_fdt() each time. Let me know if that's preferable and I'll
work it in for the next submission.

> 
> -- 
> David Gibson                    | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
>                                 | _way_ _around_!
> http://www.ozlabs.org/~dgibson



