qemu-ppc

From: David Gibson
Subject: Re: [Qemu-ppc] [RFC PATCH 04/26] ppc/xive: introduce a skeleton for the XIVE interrupt controller model
Date: Mon, 24 Jul 2017 13:28:53 +1000
User-agent: Mutt/1.8.3 (2017-05-23)

On Fri, Jul 21, 2017 at 06:21:31PM +1000, Benjamin Herrenschmidt wrote:
> On Fri, 2017-07-21 at 17:50 +1000, David Gibson wrote:
> > On Wed, Jul 19, 2017 at 02:02:18PM +1000, Benjamin Herrenschmidt wrote:
> > > On Wed, 2017-07-19 at 13:08 +1000, David Gibson wrote:
> > > > 
> > > > I'm somewhat uncomfortable with an irq allocator here in the intc
> > > > code.  As a rule, irq allocation is the responsibility of the machine,
> > > > not any sub-component.  Furthermore, it should allocate in a way which
> > > > is repeatable, since they need to stay stable across reboots and
> > > > migrations.
> > > > 
> > > > And, yes, we have an allocator of sorts in XICS - it has caused a
> > > > number of problems in the past.
> > > 
> > > So....
> > > 
> > > For a bare metal model (which we don't have yet) of XIVE, the IRQ
> > > numbering is entirely an artifact of how the HW is configured. There
> > > should thus be no interrupt numbers visible to qemu.
> > 
> > Uh.. I don't entirely follow.  Do you mean that during boot the guest
> > programs the irq numbers into the various components?
> 
> I said a "bare metal model" but yes. Pretty much. 

Right, by "guest" I meant the kernel running under qemu, even if it's
running on a bare-metal equivalent platform.

> > In that case this allocator stuff definitely doesn't belong in the
> > xive code.
> > 
> > > For a PAPR model things are a bit different, but if we want to
> > > maximize code re-use between the two, we probably need to make sure
> > > the interrupts "allocated" by the machine for XIVE can be represented
> > > by the HW model.
> > > 
> > > That means:
> > > 
> > >  - Each chip has a range (high bits are the block ID, which maps to a
> > > chip, low bits, around 512K to 1M interrupts is the per-chip space).
> > > 
> > >  - Interrupts 0...N of that range (N depends on how much backing
> > > memory and MMIO space is provisioned for each chip) are "generic IPIs"
> > > which are somewhat generic interrupt sources that can be triggered with
> > > an MMIO store and routed to any target. Those are used in PAPR for
> > > things like IPIs and some types of accelerator interrupts.
> > > 
> > >  - Portions of that range (which may or may not overlap the 0...N
> > > above, if they do they "shadow" the generic interrupts) can be
> > > configured to be the HW sources from the various PCIe bridges and
> > > the PSI controller.
> > 
> > Err.. I'm not sure how this relates to spapr.  There are no chips
> > or PSI there, and the PCI bridges aren't really the same thing.
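
For reference, the per-chip numbering described in the HW model above
(block ID in the high bits, per-chip index in the low bits) could be
sketched roughly as follows.  This is only an illustration, not QEMU
code: the 20-bit index width is an assumption taken from the "around
512K to 1M" figure, and encode_irq()/decode_irq() are hypothetical
helper names.

```python
# Assumption: 20 index bits, i.e. up to 1M interrupts per chip.
INDEX_BITS = 20
INDEX_MASK = (1 << INDEX_BITS) - 1


def encode_irq(block_id, index):
    """Combine a block (chip) ID and a per-chip index into one number."""
    if block_id < 0:
        raise ValueError("block_id must be non-negative")
    if not 0 <= index <= INDEX_MASK:
        raise ValueError("index does not fit in the per-chip space")
    return (block_id << INDEX_BITS) | index


def decode_irq(irq):
    """Split a global interrupt number back into (block_id, index)."""
    return irq >> INDEX_BITS, irq & INDEX_MASK
```
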
> 
> The above is the HW model, sorry for the confusion. With a few comments
> about how they are used in PAPR.
> 
> So yes, in PAPR there's an "allocator" because the hypervisor will
> create a guest "virtual" (or logical to use PAPR terminology) interrupt
> number space, in order to represent the various interrupts into the
> guest.

Ok, but is each of those logical irqs bound to a specific device/PHB
line/whatever, or can they be configured by the guest?

> Those numbers however are just tokens, they don't have to represent any
> real HW concept. So they can be "allocated" in a rather fixed way, for
> example, you could have something like a fixed map where you put all
> the PCI interrupts at a certain number (a factor of the PHB# with room
> or a fixed number per PHB, maybe 16K or so, the HW does 4K max). Another
> base would have a chunk of "general purpose" IPIs (for use for actual
> IPIs and for other things to come). And a range for the virtual device
> interrupts for example. Or you can just use an allocator.

Hm.  So what I'm meaning by an "allocator" is something at least
partially dynamic.  Something where you say "give me an irq" and it gives
you the next available or similar.  As opposed to any mapping from
devices to (logical) irqs, which the machine will need to supply one
way or another.
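
To make the distinction concrete, here is a rough sketch of the two
styles.  It is not QEMU code: DynamicIrqAllocator and fixed_phb_irq()
are hypothetical names, and the base and the 16K-per-PHB window are
illustrative values loosely based on the numbers quoted above.

```python
# Hypothetical layout constants; illustrative only, not taken from QEMU.
PHB_BASE = 0x10000         # assumed start of the PCI interrupt space
IRQS_PER_PHB = 16 * 1024   # assumed fixed window per PHB


class DynamicIrqAllocator:
    """'Give me an irq' style: hands out the next unused number."""

    def __init__(self, base=0):
        self._next = base

    def alloc(self):
        irq = self._next
        self._next += 1
        return irq


def fixed_phb_irq(phb_index, line):
    """Fixed mapping: the irq depends only on (PHB index, line)."""
    if phb_index < 0:
        raise ValueError("phb_index must be non-negative")
    if not 0 <= line < IRQS_PER_PHB:
        raise ValueError("line out of range for this PHB window")
    return PHB_BASE + phb_index * IRQS_PER_PHB + line
```

With the fixed mapping the result is a pure function of the device's
identity; with the allocator it depends on how many requests came
before.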

> But it's fundamentally an allocator that sits in the hypervisor, so in
> our case, I would say in the spapr "component" of XIVE, rather than the
> XIVE HW model itself.

Maybe..

> Now what Cedric did, because XIVE is very complex and we need something
> for PAPR quickly, is not a complete HW model, but a somewhat simplified
> one that only handles what PAPR exposes. So in that case where the
> allocator sits is a bit of a TBD...

Hm, ok.  My concern here is that "dynamic" allocation of irqs at the
machine type level needs extreme caution, or the irqs may not be
stable which will generally break migration.
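
A toy illustration of the stability problem, under the assumption that
devices get created in a different order on the source and destination
of a migration.  The device names and irq numbers are made up, and
assign_dynamic()/assign_fixed() are hypothetical helpers, not anything
in the tree.

```python
def assign_dynamic(devices):
    """Hand out irqs in creation order, starting from 0."""
    return {name: irq for irq, name in enumerate(devices)}


def assign_fixed(devices, table):
    """Look each device up in a fixed, order-independent table."""
    return {name: table[name] for name in devices}


# Hypothetical device names and irq numbers.
TABLE = {"vscsi": 0x1000, "vnet": 0x1001, "vty": 0x1002}

source = ["vty", "vnet", "vscsi"]   # creation order on the source
dest = ["vty", "vscsi", "vnet"]     # a different order on the destination

# Dynamic assignment disagrees between the two sides.
print(assign_dynamic(source) == assign_dynamic(dest))              # False
# The fixed table gives the same answer regardless of order.
print(assign_fixed(source, TABLE) == assign_fixed(dest, TABLE))    # True
```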

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
