qemu-s390x
Re: [RFC PATCH] s390x/pci: vfio-pci breakage with disabled mem enforcement


From: Niklas Schnelle
Subject: Re: [RFC PATCH] s390x/pci: vfio-pci breakage with disabled mem enforcement
Date: Tue, 28 Jul 2020 10:59:10 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.9.0


On 7/27/20 5:40 PM, Pierre Morel wrote:
> 
> 
> On 2020-07-23 18:29, Alex Williamson wrote:
>> On Thu, 23 Jul 2020 11:13:55 -0400
>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
>>
>>> I noticed that after kernel commit abafbc55 'vfio-pci: Invalidate mmaps
>>> and block MMIO access on disabled memory' vfio-pci via qemu on s390x
>>> fails spectacularly, with errors in qemu like:
>>>
>>> qemu-system-s390x: vfio_region_read(0001:00:00.0:region0+0x0, 4) failed: Input/output error
>>>
>>> From read to bar 0 originating out of
>>> hw/s390x/s390-pci-inst.c:zpci_read_bar().
>>>
>>> So, I'm trying to figure out how to get vfio-pci happy again on s390x.  From
>>> a bit of tracing, we seem to be triggering the new trap in
>>> __vfio_pci_memory_enabled().  Sure enough, if I just force this function to
>>> return 'true' as a test case, things work again.
>>> The included patch attempts to enforce the setting, which restores
>>> everything to working order but also triggers vfio_bar_restore() in the
>>> process....  So this isn't the right answer, more of a proof-of-concept.
>>>
>>> @Alex: Any guidance on what needs to happen to make qemu-s390x happy with
>>> this recent kernel change?
>>
>> Bummer!  I won't claim to understand s390 PCI, but if we have a VF
>> exposed to the "host" (ie. the first level where vfio-pci is being
>> used), but we can't tell that it's a VF, how do we know whether the
>> memory bit in the command register is unimplemented because it's a VF
>> or unimplemented because the device doesn't support MMIO?  How are the
>> device ID, vendor ID, and BAR registers virtualized to the host?  Could
>> the memory enable bit also be emulated by that virtualization, much
>> like vfio-pci does for userspace?  If the other registers are
>> virtualized, but these command register bits are left unimplemented, do
>> we need code to deduce that we have a VF based on the existence of MMIO
>> BARs, but lack of memory enable bit?  Thanks,
>>
>> Alex
> 
> Alex, Matt,
> 
> in s390 we have the possibility to assign a virtual function to a logical 
> partition as function 0.
> In this case it can not be treated as a virtual function but must be treated 
> as a physical function.
> This is currently working very well.
Can you explain why it must be treated as a physical function and must not
have is_virtfn set?
I'm currently reworking my fix for PF/VF linking not happening for all ways
of attaching a VF, and in that rework I intend to set is_virtfn = 1 also for
VFs that are not linked with a PF, including those attached to an LPAR.

So far I really cannot see a reason why that should not work, since I was
wrong before and firmware does tell us that these are indeed VFs
(zdev->is_physfn == 0).
AFAIK on nearly all platforms guests will often have a VF as function zero
on a bus, because that is what I expect to happen if you pass it through as
a PCI function.
So unless I'm missing something, that just makes LPAR look more like a QEMU
guest on another platform, which is very likely much better tested than
treating a VF as a PF as we have been doing.
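
As a rough illustration only (this is not the actual rework), the idea of
deriving is_virtfn from the firmware flag could look like the sketch below.
The struct and function names are hypothetical stand-ins for struct zpci_dev
and struct pci_dev, reduced to the two flags mentioned above, so that the
snippet compiles on its own outside the kernel.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct zpci_dev: only the firmware flag. */
struct zdev_model {
	bool is_physfn;	/* firmware reports a physical function */
};

/* Hypothetical stand-in for struct pci_dev: only the VF flag. */
struct pdev_model {
	bool is_virtfn;	/* common PCI code treats the function as a VF */
};

/*
 * Mark the function as a VF whenever firmware does not report it as a PF,
 * whether or not a parent PF is visible and linked (e.g. a VF attached
 * directly to an LPAR).
 */
static void model_set_virtfn(struct pdev_model *pdev,
			     const struct zdev_model *zdev)
{
	pdev->is_virtfn = !zdev->is_physfn;
}
```

With this, a function for which firmware reports is_physfn == 0 ends up with
is_virtfn set even when no PF is linked.
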
> However, these functions do not set PCI_COMMAND_MEMORY as we need.
> 
> Shouldn't we fix this inside the kernel, to keep older QEMU working?
> 
> Then would it be OK to add a new bit/boolean inside the 
> pci_dev/vfio_pci_device like, is_detached_vfn, that we could set during 
> enumeration and test inside __vfio_pci_memory_enabled() to return true?
This does not make sense to me. As I wrote above, it's totally normal for VMs
to see VFs detached from the PF, as they are passed through to a QEMU guest,
so IMHO that's already covered by the meaning of is_virtfn.
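
To make the point concrete, here is a minimal userspace model of the kind of
check discussed for __vfio_pci_memory_enabled(). It is a sketch under
assumptions, not the kernel code: struct vdev_model and
model_memory_enabled() are hypothetical names, and the assumption is that the
check lets VFs through regardless of the command register, since a VF does
not implement the memory enable bit itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Memory Space Enable bit of the PCI command register. */
#define PCI_COMMAND_MEMORY 0x2

/* Hypothetical, reduced stand-in for the vfio-pci device state. */
struct vdev_model {
	bool is_virtfn;	/* function is known to be a VF */
	uint16_t cmd;	/* virtualized PCI command register value */
};

/*
 * Assumed shape of the check: a VF always counts as memory enabled,
 * anything else needs PCI_COMMAND_MEMORY set in the command register.
 */
static bool model_memory_enabled(const struct vdev_model *vdev)
{
	return vdev->is_virtfn || (vdev->cmd & PCI_COMMAND_MEMORY) != 0;
}
```

Under that assumption, a detached VF with is_virtfn set passes the check
without an extra is_detached_vfn flag, while a function that is not marked as
a VF and has the memory bit clear is rejected, which matches the failure
reported at the top of the thread.
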
> 
> In the enumeration we have the possibility to know if the function is a 
> HW/Firmware virtual function on devfn 0 or if it is created by SRIOV.
> 



