Re: [PATCH v7 12/17] vfio-user: IOMMU support for remote device


From: Igor Mammedov
Subject: Re: [PATCH v7 12/17] vfio-user: IOMMU support for remote device
Date: Tue, 19 Apr 2022 10:48:49 +0200

On Wed, 13 Apr 2022 15:08:33 -0400
Peter Xu <peterx@redhat.com> wrote:

> On Wed, Apr 13, 2022 at 04:37:35PM +0200, Igor Mammedov wrote:
> > On Thu, 31 Mar 2022 08:41:01 -0400
> > Peter Xu <peterx@redhat.com> wrote:
> >   
> > > On Thu, Mar 31, 2022 at 10:47:33AM +0100, Stefan Hajnoczi wrote:  
> > > > On Wed, Mar 30, 2022 at 01:13:03PM -0400, Peter Xu wrote:    
> > > > > On Wed, Mar 30, 2022 at 05:08:24PM +0100, Stefan Hajnoczi wrote:    
> > > > > > On Wed, Mar 30, 2022 at 08:53:16AM -0400, Peter Xu wrote:    
> > > > > > > On Wed, Mar 30, 2022 at 11:04:24AM +0100, Stefan Hajnoczi wrote:  
> > > > > > >   
> > > > > > > > This makes me wonder whether there is a deeper issue with the
> > > > > > > > pci_setup_iommu() API: the lack of per-device cleanup callbacks.
> > > > > > > > Per-device IOMMU resources should be freed when a device is hot
> > > > > > > > unplugged.
> > > > > > > > 
> > > > > > > > From what I can tell this is not the case today:
> > > > > > > > 
> > > > > > > > - hw/i386/intel_iommu.c:vtd_find_add_as() allocates and adds
> > > > > > > >   device address spaces but I can't find where they are removed
> > > > > > > >   and freed.  VTDAddressSpace instances pointed to from
> > > > > > > >   vtd_bus->dev_as[] are leaked.
> > > > > > > > 
> > > > > > > > - hw/i386/amd_iommu.c has similar leaks.
> > > > > > > 
> > > > > > > AFAICT it's because there's no device-specific data cached in the
> > > > > > > per-device IOMMU address space, at least so far.  IOW, all the
> > > > > > > data structures allocated here can be re-used when a new device is
> > > > > > > plugged in after the old device is unplugged.
> > > > > > > 
> > > > > > > It's definitely not ideal, since after unplug (and before a new
> > > > > > > device is plugged in) the resource is not needed at all, so it's
> > > > > > > kind of wasted, but it should work functionally.  If we want to
> > > > > > > achieve that, some iommu_unplug() or iommu_cleanup() hook sounds
> > > > > > > reasonable.
> > > > > > 
> > > > > > I guess the question is whether PCI busses can be hotplugged with
> > > > > > IOMMUs. If yes, then there is a memory leak that matters for
> > > > > > intel_iommu.c and amd_iommu.c.    
> > > > > 
> > > > > It can't, and we only support one vIOMMU so far for both (commit
> > > > > 1b3bf13890fd849b26).  Thanks,    
> > > > 
> > > > I see, thanks!
> > > > 
> > > > Okay, summarizing options for the vfio-user IOMMU:
> > > > 
> > > > 1. Use the same singleton approach as existing IOMMUs where the
> > > >    MemoryRegion/AddressSpace are never freed. Don't bother deleting.
> > > > 
> > > > 2. Keep the approach in this patch where vfio-user code manually calls a
> > > >    custom delete function (not part of the pci_setup_iommu() API). This
> > > >    is slightly awkward to do without global state and that's what
> > > >    started this discussion.
> > > > 
> > > > 3. Introduce an optional pci_setup_iommu() callback:
> > > > 
> > > >    typedef void (*PCIIOMMUDeviceUnplug)(PCIBus *bus, void *opaque,
> > > >                                          int devfn);
> > > > 
> > > >    Solves the awkwardness of option #2. Not needed by existing IOMMU
> > > >    devices.    
> > > 
> > > Looks all workable to me.  One tiny thing is that if we'd like 3) we may
> > > want to pass the PCIDevice* too, because in this case IIUC we'd need to
> > > double-check the device class before doing anything - we may not want to
> > > call the vfio-user callbacks for general emulated devices under the same
> > > PCI bus.
> > > 
> > > I think we could also fetch that from PCIBus.devices[devfn], but that's
> > > just not as obvious.
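
For concreteness, a rough sketch of what option 3) plus the PCIDevice*
suggestion could look like - the extra parameter on the unplug typedef and
the registration helper are made up for illustration; only PCIIOMMUFunc and
pci_setup_iommu() exist today:

/* existing per-bus hook, registered via pci_setup_iommu() */
typedef AddressSpace *(*PCIIOMMUFunc)(PCIBus *bus, void *opaque, int devfn);

/*
 * hypothetical optional unplug hook: PCIDevice* is passed so the callback
 * can check the device class and skip devices it does not own (e.g. plain
 * emulated devices sitting on the same bus)
 */
typedef void (*PCIIOMMUDeviceUnplug)(PCIBus *bus, PCIDevice *pdev,
                                     void *opaque, int devfn);

/* hypothetical companion to pci_setup_iommu() */
void pci_setup_iommu_unplug(PCIBus *bus, PCIIOMMUDeviceUnplug unplug,
                            void *opaque);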
> > > 
> > > Option 4) is, as mentioned previously, that we add another device unplug
> > > hook that can be registered per-device.  I just didn't think thoroughly on
> > can you expand on why a per-device hook is needed?
> 
> E.g. when the PCI bus that contains the vfio-user device also contains
> another emulated device?  Then IIUC we only want to call the vfio-user hook
> for the vfio-user device, not for the other devices on the same bus?
> 
> Per-bus will work too, but then the per-bus hook will need to first identify
> the PCIDevice* object, so it'll work similarly to a per-device hook.

The question is whether a hook is really needed at all, and why doing the
cleanup in vfio-user's unrealize() is not sufficient.
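
I.e. something along these lines, entirely within the device's own teardown
path (the function names are placeholders, not taken from this series):

static void remote_pci_dev_unrealize(DeviceState *dev)    /* placeholder */
{
    PCIDevice *pdev = PCI_DEVICE(dev);

    /* ... existing teardown ... */

    /*
     * drop this device's IOMMU AddressSpace/MemoryRegion state here,
     * instead of routing it through a new bus-level per-device hook
     */
    remote_iommu_del_device(pdev);                         /* placeholder */
}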

Also, there are realize/unrealize listener hooks that could help attach
arbitrary code to the plug/unplug workflow as a last resort (i.e. avoiding
properly dividing responsibility between emulated device models).  On top of
that, the listener interface doesn't have any error handling, so the hooks
must never fail.
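
Roughly (DeviceListener is declared in include/hw/qdev-core.h; the filter
and the cleanup body below are placeholders):

static void iommu_unrealize_cb(DeviceListener *listener, DeviceState *dev)
{
    /* called for every device; filter for the ones we actually care about */
    if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
        /* free the per-device IOMMU resources; this path must not fail */
    }
}

static DeviceListener iommu_unrealize_listener = {
    .unrealize = iommu_unrealize_cb,
};

/* during setup: */
device_listener_register(&iommu_unrealize_listener);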

(Also, this is a generic vfio issue: it's written as a mix of backend and
frontend code, which is confusing at times and makes it hard to distinguish
what belongs where, unless one has intimate knowledge of how it is supposed
to work.)

> > > how it would interact with the current HotplugHandler design yet.. it
> > > looks quite similar, but so far it's either for the machine type or the
> > > PCI bus, not capable of registering on one single device (and it's always
> > > a mystery to me why we'd rather ignore the per-bus hook if the machine
> > > hook existed.. that's in qdev_get_hotplug_handler).
> > 
> > The machine hook is there mainly for bus-less devices; if it's not defined,
> > the code will fall back to the bus handler if one exists.
> > 
> > However, the machine hook can also be used to override the default hotplug
> > chain to implement a non-trivial plug/unplug flow.  For example, see
> > pc_get_hotplug_handler() and the corresponding
> > pc_machine_device_[pre_plug|plug|unplug...]_cb(), where the plug/unplug
> > chain is altered for some PCI device types.  Perhaps the same can be done
> > for vfio.
> 
> It just seems non-obvious, no?  For example, if someone implements a PCI
> bus with hotplug_handler() provided, it would surprise me a bit if it's
> triggered conditionally, depending on which machine type the bus is
> attached to, and whether this machine type has get_hotplug_handler().

The bus-level handler is called anyway; the difference is that the
plug/unplug flow can additionally be redirected to other relevant components.

It might be confusing at first.  The idea behind having a single entry point
to override the flow at the machine level is that the machine is the topmost
object: it governs all other devices, wires them together, and is allowed to
reshape the default wiring between attached devices without violating the
relationships between components.  (The drawback is that this approach adds
quite a bit of boilerplate, but no one has suggested or implemented any other
idea for generic device wiring.)
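
To make the entry point concrete, the selection logic in question is roughly
the following (shortened from qdev_get_hotplug_handler(): the machine's
handler is asked first and, as described above, may redirect the flow back
through the bus handler and other components):

HotplugHandler *qdev_get_hotplug_handler(DeviceState *dev)
{
    HotplugHandler *hotplug_ctrl = qdev_get_machine_hotplug_handler(dev);

    if (hotplug_ctrl == NULL && dev->parent_bus) {
        hotplug_ctrl = qdev_get_bus_hotplug_handler(dev);
    }
    return hotplug_ctrl;
}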

Adding specialized hooks to generic bus code for random device quirks might
work too, but it's not obvious either: such hooks end up scattered through
the codebase, often polluting generic code with device-specific workarounds,
and it's harder to review/judge whether a proposed hook is not just another
layer violation.
Well, I don't know enough about vfio/iommu to make a useful suggestion, but
adding vfio-induced hooks to the generic PCI bus/device code does look
suspicious to me.


> Thanks,
> 



