From: Alex Williamson
Subject: Re: [Qemu-devel] VFIO: Not require to make VFIO_IOMMU_MAP_DMA for MMAPED PCI bars
Date: Mon, 18 Feb 2013 21:24:34 -0700
On Tue, 2013-02-19 at 04:05 +0000, Bhushan Bharat-R65777 wrote:
>
> > -----Original Message-----
> > From: Alex Williamson [mailto:address@hidden
> > Sent: Thursday, February 14, 2013 11:57 PM
> > To: Bhushan Bharat-R65777
> > Cc: Yoder Stuart-B08248; Wood Scott-B07421; Graf Alexander-B36701; qemu-
> > address@hidden; address@hidden
> > Subject: Re: VFIO: Not require to make VFIO_IOMMU_MAP_DMA for MMAPED PCI
> > bars
> >
> > On Thu, 2013-02-14 at 08:22 +0000, Bhushan Bharat-R65777 wrote:
> > > Hi Alex Williamson,
> > >
> > > I am able (with some hacks :)) to directly assign the e1000 PCI device
> > > to a KVM guest using VFIO on a Freescale device. One of the problems I
> > > am facing is the DMA mapping in the IOMMU for PCI device BARs. On
> > > Freescale devices, the mmapped PCI device BARs are not required to be
> > > mapped in the IOMMU. Typically the flow of in/out transactions (from
> > > the CPU) is:
> > >
> > > Incoming flow:
> > >
> > > -----|      |----------|           |---------------|                     |-------------|
> > > CORE |<----<| IOMMU    |<---<---<--| PCI-Controller|<------<-----<----<--| PCI device  |
> > > -----|      |----------|           |---------------|                     |-------------|
> > >
> > > Outgoing Flow: IOMMU is bypassed for out transactions
> > >
> > > -----|      |----------|           |---------------|                     |-------------|
> > > CORE |>--+  |  IOMMU   |   +->-->--| PCI-Controller|>------>----->---->--| PCI device  |
> > > -----|   |  |----------|   ^       |---------------|                     |-------------|
> > >          |                 |
> > >          +-----------------+
> >
> > Mapping the BAR into the IOMMU isn't for this second use case; CPU-to-device
> > accesses are never IOMMU translated.  Instead, it's for this:
> >
> >   |----------|           |---------------|                     |-------------|
> >   | IOMMU    |<---<---<--| PCI-Controller|<------<-----<----<--| PCI device  |
> >   |----------|           |---------------|                     |-------------|
> >        |
> >        |                 |---------------|                     |-------------|
> >        +-->---->----->---| PCI-Controller|>------>----->---->--| PCI device  |
> >                          |---------------|                     |-------------|
> >
> > It's perfectly fine not to support peer-to-peer DMA, but let's only skip the
> > mapping where it's not supported, in case it might actually work in some cases.
> >
> > > Also, because of some hardware limitations of our IOMMU it is difficult
> > > to map these BAR regions alongside RAM (DDR) regions. So on Freescale
> > > devices we want the VFIO_IOMMU_MAP_DMA ioctl to be called for RAM
> > > regions (DDR) only and _not_ for these mmapped RAM regions of PCI
> > > device BARs. I understand that we need to add these mmapped PCI BARs
> > > as RAM type in the QEMU memory_region_*() API. So for that I tried to
> > > skip these regions in the VFIO memory listener. Below are the changes
> > > which work for me. I am not sure whether this is the correct approach;
> > > please suggest.
> > >
> > > -------------
> > > diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
> > > index c51ae67..63728d8 100644
> > > --- a/hw/vfio_pci.c
> > > +++ b/hw/vfio_pci.c
> > > @@ -1115,9 +1115,35 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
> > >      return -errno;
> > >  }
> > >
> > > -static bool vfio_listener_skipped_section(MemoryRegionSection *section)
> > > +static int memory_region_is_mmap_bars(VFIOContainer *container,
> > > +                                      MemoryRegionSection *section)
> > >  {
> > > -    return !memory_region_is_ram(section->mr);
> > > +    VFIOGroup *group;
> > > +    VFIODevice *vdev;
> > > +    int i;
> > > +
> > > +    QLIST_FOREACH(group, &container->group_list, next) {
> >
> > I think you want to start at the global &group_list
>
> Why do you think global? My understanding is that the dma_map/unmap
> operation is limited to the groups in the given container.
There can theoretically be more than one container, so if you're only
searching groups within a container then you may not be searching all
the devices.
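For illustration, a minimal sketch of walking the global list instead (this
assumes the static group_list in hw/vfio_pci.c; vfio_device_match_region() is
a made-up name for whatever comparison helper ends up being used):

static bool vfio_region_is_device_bar(MemoryRegionSection *section)
{
    VFIOGroup *group;
    VFIODevice *vdev;

    /* Walk every group QEMU knows about, not just one container's groups,
     * so devices behind every container are considered. */
    QLIST_FOREACH(group, &group_list, next) {
        QLIST_FOREACH(vdev, &group->device_list, next) {
            if (vfio_device_match_region(vdev, section)) {
                return true;
            }
        }
    }
    return false;
}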
> > > +        QLIST_FOREACH(vdev, &group->device_list, next) {
> > > +            if (vdev->msix->mmap_mem.ram_addr == section->mr->ram_addr)
> >
> > Note the comment in memory.h:
> >
> > struct MemoryRegion {
> > /* All fields are private - violators will be prosecuted */
> > ...
> > ram_addr_t ram_addr;
> >
> > You'll need to invent some kind of memory region comparison interface
> > instead of accessing these private fields.
>
> You mean adding some API in memory.c?
Yes
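A minimal sketch of what such an interface could look like (the name
memory_region_equal_ram() is made up here; only memory.c itself would touch
the private ram_addr field):

bool memory_region_equal_ram(MemoryRegion *a, MemoryRegion *b)
{
    /* Compare the backing RAM of two regions without callers poking at
     * the private ram_addr field themselves. */
    return memory_region_is_ram(a) && memory_region_is_ram(b) &&
           a->ram_addr == b->ram_addr;
}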
> > > +                return 1;
> > > +            for (i = 0; i < PCI_ROM_SLOT; i++) {
> > > +                VFIOBAR *bar = &vdev->bars[i];
> > > +                if (bar->mmap_mem.ram_addr == section->mr->ram_addr)
> > > +                    return 1;
> > > +            }
> > > +        }
> > > +    }
> > > +
> > > +    return 0;
> > > +}
> >
> > It's a pretty heavyweight function; I think the memory API should help us
> > differentiate MMIO from RAM.
>
> What about adding a bool in "struct MemoryRegion {":
>
>     bool require_iommu_map;
>
> There will be APIs to set/clear this bool, and by default all regions with
> "ram = true" will also have "require_iommu_map = true":
>
> if (!(memory_region->ram && memory_region->require_iommu_map))
>     skip_iommu_map;
I don't think memory.c wants to care about requiring iommu mapping or
not. Perhaps an mmio address space vs a ram address space.
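As an illustration of the mmio-vs-ram idea (memory_region_is_ram_device() is
how later QEMU marks regions backed by a device mmap; it is not part of the
patch above, just a sketch of the shape the check could take):

static bool vfio_listener_skipped_section(MemoryRegionSection *section)
{
    /* Pure MMIO, not RAM-backed at all: never DMA map. */
    if (!memory_region_is_ram(section->mr)) {
        return true;
    }
    /* RAM-like but backed by a device mmap (e.g. an mmapped BAR): a
     * platform/IOMMU capability check would decide this; shown here as
     * always skipped. */
    if (memory_region_is_ram_device(section->mr)) {
        return true;
    }
    return false;
}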
> >
> > > +static bool vfio_listener_skipped_section(VFIOContainer *container,
> > > +                                          MemoryRegionSection *section)
> > > +{
> > > +    if (!memory_region_is_ram(section->mr))
> > > +        return 1;
> >
> > Need some kind of conditional here for platforms that support peer-to-peer.
> > Thanks,
>
> What about adding an iommu_type here (VFIO_TYPE2_IOMMU), which does
> not require BARs to be mapped in the IOMMU?
"Type2" seems excessive, there's already an VFIO_IOMMU_GET_INFO ioctl
that could easily add a flag and maybe a field. I think you're doing
something with iommu geometry, perhaps inferring the geometry somehow.
Why not add that to the GET_INFO data and use that to decide whether to
restrict mapping to ram? Thanks,
Alex
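
For illustration, a sketch of the GET_INFO direction (struct
vfio_iommu_type1_info and VFIO_IOMMU_GET_INFO are the existing interface;
the VFIO_IOMMU_INFO_RAM_ONLY flag below is invented, not a real flag):

#include <stdbool.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Hypothetical flag a platform IOMMU driver could report; not in the UAPI. */
#define VFIO_IOMMU_INFO_RAM_ONLY (1 << 4)

static bool vfio_container_ram_only(int container_fd)
{
    struct vfio_iommu_type1_info info = { .argsz = sizeof(info) };

    if (ioctl(container_fd, VFIO_IOMMU_GET_INFO, &info)) {
        return false;   /* be permissive if the query fails */
    }
    return info.flags & VFIO_IOMMU_INFO_RAM_ONLY;
}

The memory listener would then consult this once per container before deciding
whether to register mmapped BAR regions with the IOMMU.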