qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH] s390x/pci: vfio-pci breakage with disabled mem enforceme


From: Niklas Schnelle
Subject: Re: [RFC PATCH] s390x/pci: vfio-pci breakage with disabled mem enforcement
Date: Tue, 28 Jul 2020 16:13:57 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.9.0


On 7/28/20 2:52 PM, Alex Williamson wrote:
> On Tue, 28 Jul 2020 11:33:55 +0200
> Niklas Schnelle <schnelle@linux.ibm.com> wrote:
> 
>> On 7/27/20 6:47 PM, Alex Williamson wrote:
>>> On Mon, 27 Jul 2020 17:40:39 +0200
>>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>>   
>>>> On 2020-07-23 18:29, Alex Williamson wrote:  
>>>>> On Thu, 23 Jul 2020 11:13:55 -0400
>>>>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
>>>>>     
>>>>>> I noticed that after kernel commit abafbc55 'vfio-pci: Invalidate mmaps
>>>>>> and block MMIO access on disabled memory' vfio-pci via qemu on s390x
>>>>>> fails spectacularly, with errors in qemu like:  
>> ... snip ...
>>>>
>>>> Alex, Matt,
>>>>
>>>> in s390 we have the possibility to assign a virtual function to a 
>>>> logical partition as function 0.
>>>> In this case it can not be treated as a virtual function but must be 
>>>> treated as a physical function.
>>>> This is currently working very well.
>>>> However, these functions do not set PCI_COMMAND_MEMORY as we need.  
>>>
>>> Where is the vendor and device ID virtualization done for these
>>> devices, we can't have a PF with IDs 0000:0000.  
>> Pierre doesn't mean the Device/Vendor IDs he means it has devfn == 0
>> so it is the mandatory function zero on it's PCI bus, where until recently
>> we always had only one function per bus but with the recent multi-function
>> support it can act more like on other platforms with several PCI functions
>> sharing the same Bus e.g. a PF and the VFs created through sriov_numvfs.
>> That's why I'm saying that having devfn == 0 should not be very special for 
>> a VF
>> passed to a VM and I really don't see where it would not act like a VF passed
>> from any other Hypervisor.
> 
> My question is relative to other registers on VFs that are not
> implemented in hardware, not the device address.  In addition to the
> memory bit of the command register, SR-IOV VFs do not implement the
> vendor and device IDs, these are to be virtualized from the values
> provided in the PF SR-IOV capability.  It wouldn't make much sense to
> expose a device with no PCI vendor or device ID, so I assume the z/VM
> hypervisor is virtualizing these, but not the memory bit.
Ahh, yes I see. On Z these are actually already virtualized at the LPAR
level as part of the firmware based scanning I mentioned that is the
reason for pdev->no_vf_scan. This is true even for VFs created
through sriov_numvfs in the host which is why I did not realize these
don't come from hardware, even though looking at drivers/pci/iov.c it's
pretty obvious and I did touch that code before. Sorry for the confusion.
>  
>> The only really tricky part in my opinion is where during the "probing"
>> we do set is_virtfn so it happens both for VFs passed-through from z/VM
>> or LPAR and VFs created through sriov_numvfs which unlike on other platforms
>> are also scanned by Firmware (pdev->no_vf_scan disables the Linux side 
>> scanning).
>> With the fix I'm currently testing I had to do this in 
>> pcibios_enable_device()
>> because I also create sysfs links between VFs and their parent PFs and those
>> need the sysfs entries to be already created, which makes the more 
>> apropriately
>> sound pcibios_bus_add_device() too early.
> 
> 
> I don't think it would be wise to set is_virtfn without a physfn being
> present in the OS view.  I believe pci_dev.is_virtfn implies
> pci_dev.physfn points to the PF device.  Thanks> 
> Alex
Thank you a lot for that hint, you're absolutely right, while the
drivers do work with is_virtfn == 1 && physffn == NULL
vfio-pci gets very confused. And sorry Pierre for doubting
your is_virtfn_detached idea, this thing really is confusing.
> 
>>>> Shouldn't we fix this inside the kernel, to keep older QMEU working?
>>>>
>>>> Then would it be OK to add a new bit/boolean inside the 
>>>> pci_dev/vfio_pci_device like, is_detached_vfn, that we could set during 
>>>> enumeration and test inside __vfio_pci_memory_enabled() to return true?  
>>>
>>> Probably each instance of is_virtfn in vfio-pci should be looked at to
>>> see if it applies to s390.  If we're going to recognize this as a VF,
>>> I'd rather we complete the emulation that the lower level hypervisor
>>> has missed.  If we can enable all the is_virtfn code on s390, then we
>>> should probably cache is_virtfn on the vfio_pci_device object and allow
>>> s390 a place to set it once at probe or enable time.
>>>   
>>>> In the enumeration we have the possibility to know if the function is a 
>>>> HW/Firmware virtual function on devfn 0 or if it is created by SRIOV.
>>>>
>>>> It seems an easy fix without side effects.
>>>>
>>>> What do you think?  
>>>
>>> It sure seems preferable to recognize that it is a VF in the kernel
>>> than to require userspace to have arch specific hacks.  Thanks,
>>>
>>> Alex
>>>   
>>
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]