qemu-devel


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [RFC PATCH] pci: Use PCI aliases when determining device IOMMU address space
Date: Wed, 27 Mar 2019 11:35:35 -0400

On Wed, Mar 27, 2019 at 02:25:00PM +0800, Peter Xu wrote:
> On Tue, Mar 26, 2019 at 04:55:19PM -0600, Alex Williamson wrote:
> > Conventional PCI buses pre-date requester IDs.  An IOMMU cannot
> > distinguish by devfn & bus between devices in a conventional PCI
> > topology and therefore we cannot assign them separate AddressSpaces.
> > By taking this requester ID aliasing into account, QEMU better matches
> > the bare metal behavior and restrictions, and enables shared
> > AddressSpace configurations that are otherwise not possible with
> > guest IOMMU support.
> > 
> > For the latter case, given any example where an IOMMU group on the
> > host includes multiple devices:
> > 
> >   $ ls  /sys/kernel/iommu_groups/1/devices/
> >   0000:00:01.0  0000:01:00.0  0000:01:00.1
> 
> [1]
> 
> > 
> > If we incorporate a vIOMMU into the VM configuration, we're restricted
> > that we can only assign one of the endpoints to the guest because a
> > second endpoint will attempt to use a different AddressSpace.  VFIO
> > only supports IOMMU group level granularity at the container level,
> > preventing this second endpoint from being assigned:
> > 
> > qemu-system-x86_64 -machine q35... \
> >   -device intel-iommu,intremap=on \
> >   -device pcie-root-port,addr=1e.0,id=pcie.1 \
> >   -device vfio-pci,host=1:00.0,bus=pcie.1,addr=0.0,multifunction=on \
> >   -device vfio-pci,host=1:00.1,bus=pcie.1,addr=0.1
> > 
> > qemu-system-x86_64: -device vfio-pci,host=1:00.1,bus=pcie.1,addr=0.1: vfio \
> > 0000:01:00.1: group 1 used in multiple address spaces
> > 
> > However, when QEMU incorporates proper aliasing, we can make use of a
> > PCIe-to-PCI bridge to mask the requester ID, resulting in a hack that
> > provides the downstream devices with the same AddressSpace, ex:
> > 
> > qemu-system-x86_64 -machine q35... \
> >   -device intel-iommu,intremap=on \
> >   -device pcie-pci-bridge,addr=1e.0,id=pci.1 \
> >   -device vfio-pci,host=1:00.0,bus=pci.1,addr=1.0,multifunction=on \
> >   -device vfio-pci,host=1:00.1,bus=pci.1,addr=1.1
> > 
> > While the utility of this hack may be limited, this AddressSpace
> > aliasing is the correct behavior for QEMU to emulate bare metal.
> > 
> > Signed-off-by: Alex Williamson <address@hidden>
> 
> The patch looks sane to me even as a bug fix since otherwise the DMA
> address spaces used under misc kinds of PCI bridges can be wrong, so:
> 
> Reviewed-by: Peter Xu <address@hidden>
> 
> Though I have a question that confused me even before: Alex, do you
> know why the context entries of all the devices in the IOMMU root table
> will be programmed even if the devices are under a pcie-to-pci bridge?
> I'm giving an example with above [1] to be clear: in that case IIUC
> we'll program context entries for all the three devices (00:01.0,
> 01:00.0, 01:00.1) but they'll point to the same IOMMU table.  DMAs of
> devices 01:00.0 and 01:00.1 should always be tagged with 01:00.0 on
> bare metal

What makes you think so?

The PCI Express spec says:

Requester ID

The combination of a Requester's Bus Number, Device Number, and Function
Number that uniquely identifies the Requester. With an ARI Requester ID, bits
traditionally used for the Device Number field are used instead to expand the
Function Number field, and the Device Number is implied to be 0.
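
To make that definition concrete, here is a minimal sketch of how the 16-bit
Requester ID is packed, assuming the usual layout (bus in bits 15:8, device in
bits 7:3, function in bits 2:0, or an 8-bit function with ARI). The helper
names are hypothetical and this is not QEMU code. It only illustrates that two
functions of one PCIe device, such as 01:00.0 and 01:00.1, carry different
Requester IDs:

```python
def requester_id(bus: int, dev: int, fn: int) -> int:
    """Pack bus/device/function into a 16-bit Requester ID (non-ARI layout).

    Hypothetical helper for illustration only.
    """
    if not (0 <= bus <= 0xFF and 0 <= dev <= 0x1F and 0 <= fn <= 0x7):
        raise ValueError("bus/dev/fn out of range")
    return (bus << 8) | (dev << 3) | fn


def ari_requester_id(bus: int, fn: int) -> int:
    """Pack bus and an 8-bit function number (ARI layout, device implied 0)."""
    if not (0 <= bus <= 0xFF and 0 <= fn <= 0xFF):
        raise ValueError("bus/fn out of range")
    return (bus << 8) | fn


# The two endpoints from the example IOMMU group, as seen on a PCIe link.
rid_fn0 = requester_id(0x01, 0x00, 0)
rid_fn1 = requester_id(0x01, 0x00, 1)

print(f"01:00.0 -> {rid_fn0:#06x}")
print(f"01:00.1 -> {rid_fn1:#06x}")
print("distinct:", rid_fn0 != rid_fn1)
```

Under these assumptions the two functions stay distinguishable by Requester ID
unless something upstream, such as a bridge to conventional PCI, rewrites it.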



> and then why we bother to program the context entry of
> 01:00.1?  It seems never used.
> 
> (It should be needed for current QEMU to work with pcie-to-pci bridges
>  without this patch, but I feel like I don't know the real answer
>  behind it)
> 
> Thanks,
> 
> -- 
> Peter Xu


