qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] Re: [PATCH v3 05/13] pcie: helper functions for pcie capabi


From: Michael S. Tsirkin
Subject: [Qemu-devel] Re: [PATCH v3 05/13] pcie: helper functions for pcie capability and extended capability.
Date: Sun, 19 Sep 2010 13:45:33 +0200
User-agent: Mutt/1.5.20 (2009-12-10)

On Sun, Sep 19, 2010 at 01:56:23PM +0900, Isaku Yamahata wrote:
> On Wed, Sep 15, 2010 at 02:43:10PM +0200, Michael S. Tsirkin wrote:
> > > +/***************************************************************************
> > > + * pci express capability helper functions
> > > + */
> > > +void pcie_notify(PCIDevice *dev, uint16_t vector, bool trigger, int 
> > > level)
> > 
> > Why is this not static? It makes sense for internal stuff possibly,
> > but I think functions will need to know what to do: they can't
> > treat msi/msix/irq identically anyway.
> 
> The aer code which I split out uses it.

Move it there?

> > The API seems confusing, I think this is what is creating
> > code for you. Specifically level = 0 does not notify at all.
> > So I think we need two:
> > 1. pcie_assert_interrupt which sends msi or sets level to 1
> > 2. pcie_deassert_interrupt which sets level to 0, or nothing
> >    for non msi.
> > 
> > Then below you can e.g.
> > if (!sltctrl) {
> >     pcie_deassert(...);
> >     return;
> > }
> 
> As I already mentioned in the other mail, when to assert MSI
> can be different from INTx. The express spec utilizes it.
> For example hot plug, aer, and so on.

My comment really has to do with the API.
The API that gets level and trigger values is confusing.
Two functions assert and deassert would be clearer.

> 
> > > +                    vector, trigger, level);
> > > +    if (msix_enabled(dev)) {
> > > +        if (trigger) {
> > > +            msix_notify(dev, vector);
> > > +        }
> > > +    } else if (msi_enabled(dev)) {
> > > +        if (trigger){
> > > +            msi_notify(dev, vector);
> > > +        }
> > > +    } else {
> > > +        qemu_set_irq(dev->irq[0], level);
> > 
> > always 0? really? This is INTA# - is this what the spec says?
> 
> It depends on each device implementation. So I just picked INTA#.
> Okay, I'll make it customizable by using property.

This might not even be static.
What if device can assert multiple INTx# pins?
I think a saner way is just to expose this in the API.

> 
> > > +    }
> > > +}
> > > +
> > > +int pcie_cap_init(PCIDevice *dev, uint8_t offset, uint8_t type, uint8_t 
> > > port)
> > > +{
> > > +    int exp_cap;
> > > +    uint8_t *pcie_cap;
> > > +
> > > +    assert(pci_is_express(dev));
> > > +
> > > +    exp_cap = pci_add_capability(dev, PCI_CAP_ID_EXP, offset,
> > > +                                 PCI_EXP_VER2_SIZEOF);
> > > +    if (exp_cap < 0) {
> > > +        return exp_cap;
> > > +    }
> > > +    dev->exp.exp_cap = exp_cap;
> > > +
> > > +    /* already done in pci_qdev_init() */
> > > +    assert(dev->cap_present & QEMU_PCI_CAP_EXPRESS);
> > 
> > Hmm. Why do we set this flag in qdev init but do the
> > rest of it here?
> 
> pci_qdev_init() needs to know it for allocating config array.
> 

Can we do everything in qdev init function?

> > > +        return -EBUSY;
> > > +    }
> > > +
> > > +    if (state) {
> > > +        if (PCI_FUNC(pci_dev->devfn) == 0) {
> > > +            /* event is per slot. Not per function
> > > +             * only generates event for function = 0.
> > > +             * When hot plug, populate functions > 0
> > > +             * and then add function = 0 last.
> > > +             */
> > > +            pcie_cap_slot_event(d, PCI_EXP_HP_EV_PDC, 
> > > PCI_EXP_SLTSTA_PDS);
> > > +        }
> > > +    } else {
> > > +        PCIBridge *br;
> > > +        PCIBus *bus;
> > > +        DeviceState *next;
> > > +        if (PCI_FUNC(pci_dev->devfn) != 0) {
> > > +            /* event is per slot. Not per function.
> > > +               accepts function = 0 only. */
> > > +            return -EINVAL;
> > 
> > Can user or guest trigger this?
> > If yes print an error.
> > IF no, assert.
> 
> Not yet. When multi function device hot plug is supported in some sense,
> it will be triggered in some way. The code is just place holder until then.
> Some TODO comment should have been there.

+ assert for now.

> 
> > > +                level = 1;
> > > +            } else {
> > > +                level = 0;
> > > +            }
> > 
> > What is this trying to implement?
> 
> hot plut event notification. Please refer to
> 6.7. PCI Express Hot-Plug Support
> 6.7.3. PCI Express Hot-Plug Events
> 6.7.3.4. Software Notification of Hot-Plug Events
> 
> > > +        assert(offset == PCI_CONFIG_SPACE_SIZE);
> > > +        pci_set_long(dev->config + offset,
> > > +                     PCI_EXT_CAP(0, 0, PCI_EXT_CAP_NEXT(header)));
> > > +    }
> > > +
> > > +    /* Make those registers read-only reserved zero */
> > 
> > So you make them readonly in both add and delete?
> > delete should revert add: let's put the
> > masks back the way they were: writeable.
> 
> In fact zeroing in add is redundant, but I added it following msix code.

It is not redundand there as registers are writeable by default:
    memset(dev->wmask + PCI_CONFIG_HEADER_SIZE, 0xff,
           config_size - PCI_CONFIG_HEADER_SIZE);


> 
> The usage model is
> - At first the registers are unused, so should be read only zero.

You can't know they are unused: in PCI spec everything
outside capability list is vendor specific so it
is writeable to let drivers do their thing.

> - add capability
>   (zeroing is redundant)
> - the device specific code sets config/mask/cmask/w1cmask as it likes
> - del capability
>   This makes the register unused, i.e. read only zero.

as above.

> Maybe it's possible to make zeroing them caller's responsible.
> 
> 
> > > +    /* the bits match the bits in Slot Control/Status registers.
> > > +     * PCI_EXP_HP_EV_xxx = PCI_EXP_SLTCTL_xxxE = PCI_EXP_SLTSTA_xxx
> > 
> > Is it important that they match?
> > We don't assume this in code, do we?
> 
> Yes, we're using it when setting stlsta.

Then they should be defined one through the other,
or just use one set of macros and have a comment in code
explaining that layout is identical.

> 
> > > +    /* events not listed aren't supported */
> > > +};
> > > +
> > > +typedef void (*pcie_flr_fn)(PCIDevice *dev);
> > 
> > Is flr special?  Can't we use the generic reset handlers?
> > If not why?
> 
> Reset(cold reset/warm reset) in generic sense corresponds to
> conventional reset in express sense which corresponds to PCI RST#.
> On the other hand FLR is different from the conventional one.
> 
> Cited from the spec
> 
> 6.6. PCI Express Reset - Rules
> 6.6.1. Conventional Reset
>  Conventional Reset includes all reset mechanisms other than Function
>  Level Reset.
> 6.6.2. Function-Level Reset (FLR)
> 
> Most devices would implement FLR as just calling something like
> qdev_reset. But the spec differentiates FLR from conventional reset,
> so the generic pcie layer should do.

I am not sure I agree. If most devices don't care, or
behave almost identically, it's ok to just call qdev_reset:
devices can find out that FLR is in progress by looking
at the config space (bit will be set there) - and we can
add a helper pcie_flr_in_progress() to test it.

This way most devices will work out of box.
Only if behaviour is mostly different would it make sense
to have a completely separate reset path.
What happens with devices you implemented?

> -- 
> yamahata



reply via email to

[Prev in Thread] Current Thread [Next in Thread]