Re: [Qemu-devel] [PATCH v2 8/8] spapr_pci: Use XICS interrupt allocator


From: Alexey Kardashevskiy
Subject: Re: [Qemu-devel] [PATCH v2 8/8] spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB
Date: Thu, 22 May 2014 20:53:24 +1000

On 05/22/2014 05:16 PM, Alexander Graf wrote:
> 
>> On 22.05.2014 at 08:53, Alexey Kardashevskiy <address@hidden> wrote:
>>
>>> On 05/21/2014 10:42 PM, Alexey Kardashevskiy wrote:
>>>> On 05/21/2014 08:35 PM, Alexander Graf wrote:
>>>>
>>>>> On 21.05.14 12:13, Alexey Kardashevskiy wrote:
>>>>>> On 05/21/2014 07:50 PM, Alexander Graf wrote:
>>>>>>> On 21.05.14 11:33, Alexey Kardashevskiy wrote:
>>>>>>>> On 05/21/2014 07:13 PM, Alexander Graf wrote:
>>>>>>>>> On 21.05.14 11:11, Michael S. Tsirkin wrote:
>>>>>>>>>> On Wed, May 21, 2014 at 11:06:09AM +0200, Alexander Graf wrote:
>>>>>>>>>>> On 21.05.14 10:52, Alexey Kardashevskiy wrote:
>>>>>>>>>>>> On 05/21/2014 06:40 PM, Alexander Graf wrote:
>>>>>>>>>>>>> On 15.05.14 11:59, Alexey Kardashevskiy wrote:
>>>>>>>>>>>>> Currently SPAPR PHB keeps track of all allocated MSI/MSIX
>>>>>>>>>>>>> interrupts as XICS used to be unable to reuse interrupts, which
>>>>>>>>>>>>> becomes a problem for dynamic MSI reconfiguration which is
>>>>>>>>>>>>> happening on guest driver reload or PCI hot (un)plug. Another
>>>>>>>>>>>>> problem is that PHB has a limit of devices supporting MSI/MSIX
>>>>>>>>>>>>> (SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This makes use of new XICS ability to reuse interrupts.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This removes cached MSI configuration from SPAPR PHB so the first
>>>>>>>>>>>>> IRQ number of a device is stored in MSI/MSIX config space so
>>>>>>>>>>>>> there is no need to store this anywhere else. From now on, SPAPR
>>>>>>>>>>>>> PHB only keeps flags telling what type of interrupt for which
>>>>>>>>>>>>> device it has configured in order to return error if (for
>>>>>>>>>>>>> example) MSIX was enabled and the guest is trying to disable MSI
>>>>>>>>>>>>> which it has not enabled.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This removes a limit for the maximum number of MSIX-enabled
>>>>>>>>>>>>> devices per PHB, now XICS and PCI bus capacity are the only
>>>>>>>>>>>>> limitation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This changes migration stream as it fixes
>>>>>>>>>>>>> vmstate_spapr_pci_msi::name which was wrong since the beginning.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This fixes traces to be more informative.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: Alexey Kardashevskiy <address@hidden>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>
>>>>>>>>>>>>> In reality either MSIX or MSI is enabled, never both. So I could
>>>>>>>>>>>>> remove msi/msix bitmaps from this patch, would it make sense?
>>>>>>>>>>>> Is this a hard requirement? Does a device have to choose between
>>>>>>>>>>>> MSIX and MSI or could it theoretically have both enabled? Is this
>>>>>>>>>>>> a PCI limitation, a PAPR/XICS limitation or just a limitation of
>>>>>>>>>>>> your implementation?
>>>>>>>>>>> My implementation does not have this limitation, I asked if I can
>>>>>>>>>>> simplify code by introducing one :)
>>>>>>>>>>>
>>>>>>>>>>> I cannot see any reason why PCI cannot have both MSI and MSIX
>>>>>>>>>>> enabled but it does not seem to be used by anyone => cannot debug
>>>>>>>>>>> and confirm.
>>>>>>>>>>>
>>>>>>>>>>> PAPR spec assumes that if the guest tries enabling MSIX when MSI is
>>>>>>>>>>> already enabled, this is a "change", not enabling both types. But
>>>>>>>>>>> it also says MSI and MSIX vector numbers are not shared. Hm.
>>>>>>>>>> Yeah, I'm not aware of any limitation on hardware here and I'd
>>>>>>>>>> rather not impose one.
>>>>>>>>>>
>>>>>>>>>> Michael, do you know of any hardware that uses MSI *and* MSI-X at
>>>>>>>>>> the same time?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Alex
>>>>>>>>> No, and the PCI spec says:
>>>>>>>>>      A function is permitted to implement both MSI and MSI-X, but
>>>>>>>>>      system software is prohibited from enabling both at the same
>>>>>>>>>      time. If system software enables both at the same time, the
>>>>>>>>>      result is undefined.
>>>>>>>> Ah, cool. So yes Alexey, feel free to impose it :).
>>>>>>> Heh. This solves just half of the problem - I still have to keep track
>>>>>>> of what device got MSI/MSIX configured via that ibm,change-msi
>>>>>>> interface. I was hoping I can store such flag somewhere in a device PCI
>>>>>>> config space but MSI/MSIX enable bit is not good as it is not set when
>>>>>>> those calls are made. And I cannot rely on address/data fields much as
>>>>>>> the guest can change them (I already use them to store IRQ numbers and
>>>>>>> btw it is missing checks when I read them back for disposal, I'll fix
>>>>>>> in next round).
>>>>>>>
>>>>>>> Or on "enable" event I could put IRQ numbers to .data of MSI config
>>>>>>> space and on "disable" check if it is not zero, then configuration took
>>>>>>> place, then I can remove my msi[]/msix[] flag arrays. If the guest did
>>>>>>> any change to MSI/MSIX config space (it does not on SPAPR except weird
>>>>>>> selftest cases), I compare .data with what ICS can possibly have and
>>>>>>> either reject "disable" or handle it and if it breaks XICS - that's too
>>>>>>> bad for the stupid guest. Would that be acceptable?
>>>>>> Can't you prohibit the guest from writing to the MSI configuration
>>>>>> registers itself? Then you don't need to do sanity checks.
>>>>>
>>>>> I could for emulated devices but VFIO uses the same code. For example,
>>>>> there is an IBM SCSI IPR card which does a "self test". For that, it
>>>>> saves MSIX BAR content, does reboot via some backdoor interface and
>>>>> restores MSIX BAR. It has been solved for VFIO in the host kernel by
>>>>> restoring MSIX data from cached values when guest is trying to restore
>>>>> it with what it thinks is actual MSIX data (it is virtualized because of
>>>>> x86). But there is cache
>>>>
>>>> We already have a cache because we don't access the real PCI registers with
>>>> msi_set_message(), no?
>>>
>>>
>>> For emulated devices there is no cache. And in any case the guest is
>>> allowed to write to it... Who knows what AIX does? I do not.
>>
>>
>> Tried GHashTable for keeping bus:dev.fn <-> (irq, num), more or less ok but
>> how to migrate such thing? Temporary cache somewhere and then unpack it? Or
>> use old style migration callbacks?
> 

> Could you try to introduce a new vmstate type that just serializes and
> deserializes hash tables? Maybe there is already a serialization
> function for it in glib?

I have not found one (most likely I do not know how to search there),
so I added my own: here is VMSTATE_HASH plus its use for SPAPR.


Is this a step in the right direction? I need to put the key/value sizes
into the VMSTATE definition somehow, but I do not really want to touch
VMStateField.
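
To make the key/value size question concrete, here is a minimal standalone
sketch of flattening such a table into a byte stream and reading it back.
It is not the VMSTATE_HASH code from the patch and does not use GLib or the
QEMU vmstate API. The MsiEntry layout and the msi_table_pack() and
msi_table_unpack() names are hypothetical, and it assumes fixed-size 32-bit
keys and values written big-endian after an entry count.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry: encoded bus:dev.fn -> (first IRQ, number of IRQs). */
typedef struct {
    uint32_t config_addr; /* key */
    uint32_t first_irq;   /* value, part 1 */
    uint32_t num;         /* value, part 2 */
} MsiEntry;

#define ENTRY_WIRE_SIZE 12u /* three big-endian 32-bit fields */

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Write count + entries; returns bytes written, or 0 if buf is too small. */
size_t msi_table_pack(const MsiEntry *e, uint32_t n, uint8_t *buf, size_t len)
{
    size_t need = 4 + (size_t)n * ENTRY_WIRE_SIZE;
    uint32_t i;

    if (len < need) {
        return 0;
    }
    put_be32(buf, n);
    for (i = 0; i < n; i++) {
        uint8_t *p = buf + 4 + (size_t)i * ENTRY_WIRE_SIZE;
        put_be32(p, e[i].config_addr);
        put_be32(p + 4, e[i].first_irq);
        put_be32(p + 8, e[i].num);
    }
    return need;
}

/* Read entries back; returns the entry count, or -1 on a malformed stream. */
int msi_table_unpack(const uint8_t *buf, size_t len, MsiEntry *out, uint32_t max)
{
    uint32_t n, i;

    if (len < 4) {
        return -1;
    }
    n = get_be32(buf);
    if (n > max || (len - 4) / ENTRY_WIRE_SIZE < n) {
        return -1;
    }
    for (i = 0; i < n; i++) {
        const uint8_t *p = buf + 4 + (size_t)i * ENTRY_WIRE_SIZE;
        out[i].config_addr = get_be32(p);
        out[i].first_irq = get_be32(p + 4);
        out[i].num = get_be32(p + 8);
    }
    return (int)n;
}
```

On the destination side the unpacked entries would be inserted back into
the hash table one by one. Because the sizes are fixed by the entry layout
in this sketch, nothing extra has to be described per field.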




Alexey Kardashevskiy (2):
  vmstate: Add helper to enable GHashTable migration
  spapr_pci: Use XICS interrupt allocator and do not cache interrupts in
    PHB

 hw/ppc/spapr_pci.c          | 144 +++++++++++++++++---------------------------
 include/hw/pci-host/spapr.h |  13 ++--
 include/migration/vmstate.h |  10 +++
 include/qemu-common.h       |  13 ++++
 trace-events                |   5 +-
 vmstate.c                   |  54 +++++++++++++++++
 6 files changed, 141 insertions(+), 98 deletions(-)

-- 
1.9.rc0
