From: Alex Williamson
Subject: Re: [Qemu-ppc] [PATCH qemu v16 01/19] vfio: Delay DMA address space listener release
Date: Fri, 13 May 2016 16:24:53 -0600
On Fri, 13 May 2016 17:16:48 +1000
Alexey Kardashevskiy <address@hidden> wrote:
> On 05/06/2016 08:39 AM, Alex Williamson wrote:
> > On Wed, 4 May 2016 16:52:13 +1000
> > Alexey Kardashevskiy <address@hidden> wrote:
> >
> >> This postpones VFIO container deinitialization to let region_del()
> >> callbacks (called via vfio_listener_release) do proper clean up
> >> while the group is still attached to the container.
> >
> > Any mappings within the container should clean themselves up when the
> > container is deprivileged by removing the last group in the kernel. Is
> > the issue that that doesn't happen, which would be a spapr vfio kernel
> > bug, or that our QEMU side structures get all out of whack if we let
> > that happen?
>
> My mailbase got corrupted, missed that.
>
> This is mostly for "[PATCH qemu v16 17/19] spapr_iommu, vfio, memory:
> Notify IOMMU about starting/stopping being used by VFIO", I should have put
> 01/19 and 02/19 right before 17/19, sorry about that.
Which I object to; it's just ridiculous to have vfio start/stop
callbacks in a set of generic iommu region ops.
> Every reboot the spapr machine removes all (i.e. one or two) windows and
> creates the default one.
>
> I do this by memory_region_del_subregion(iommu_mr) +
> memory_region_add_subregion(iommu_mr). Which gets translated to
> VFIO_IOMMU_SPAPR_TCE_REMOVE + VFIO_IOMMU_SPAPR_TCE_CREATE via
> vfio_memory_listener if there is VFIO; no direct calls from spapr to vfio
> => cool. During the machine reset, the VFIO device is there with the
> container and groups attached, at some point with no windows.
>
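The reset flow described above can be sketched as a simplified model. Here `spapr_tce_create()`/`spapr_tce_remove()` are stubs standing in for the real VFIO_IOMMU_SPAPR_TCE_CREATE/REMOVE ioctls on `container->fd`, and `region_add()`/`region_del()` are bare stand-ins for the vfio_memory_listener callbacks; none of this is QEMU's actual code:

```c
#include <assert.h>

/* Simplified model: how many hardware DMA windows the container has.
 * The two helpers stand in for the VFIO_IOMMU_SPAPR_TCE_CREATE and
 * VFIO_IOMMU_SPAPR_TCE_REMOVE ioctls issued on container->fd. */
static int hw_windows;

static void spapr_tce_create(void) { hw_windows++; }
static void spapr_tce_remove(void) { hw_windows--; }

/* vfio_memory_listener stand-ins: adding or removing an IOMMU subregion
 * is translated into creating or removing a hardware window, with no
 * direct spapr -> vfio calls. */
static void region_add(void) { spapr_tce_create(); }
static void region_del(void) { spapr_tce_remove(); }

/* Machine reset as described above: remove all (one or two) windows via
 * memory_region_del_subregion(), then recreate the default one via
 * memory_region_add_subregion(). */
static void machine_reset(int windows)
{
    for (int i = 0; i < windows; i++) {
        region_del();   /* del_subregion -> TCE_REMOVE */
    }
    region_add();       /* default window: add_subregion -> TCE_CREATE */
}
```

The point of the model is only the call ordering: the container and groups stay attached across reset, briefly with no windows at all.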
> Now to VFIO plug/unplug.
>
> When VFIO plug happens, vfio_memory_listener is created, region_add() is
> called, the hardware window is created (via VFIO_IOMMU_SPAPR_TCE_CREATE).
> Unplugging should end up doing VFIO_IOMMU_SPAPR_TCE_REMOVE somehow. If
> region_del() is not called when the container is being destroyed (as before
> this patchset), then the kernel cleans and destroys windows when
> close(container->fd) is called or when qemu is killed (and this fd is
> naturally closed), I hope this answers the comment from 02/19.
>
> So far so good (right?)
>
> However I also have a guest view of the TCE table, this is what the guest
> sees and this is what emulated PCI devices use. This guest view is either
> allocated in the KVM (so H_PUT_TCE can be handled quickly right in the host
> kernel, even in real mode) or userspace (VFIO case).
>
> I generally want the guest view to be in the KVM. However when I plug VFIO,
> I have to move the table to the userspace. When I unplug VFIO, I want to do
> the opposite so I need a way to tell spapr that it can move the table.
> region_del() seemed a natural way of doing this as region_add() is already
> doing the opposite part.
>
> With this patchset, each IOMMU MR gets a usage counter, region_add() does
> +1, region_del() does -1 (yeah, not extremely optimal during reset). When
> the counter goes from 0 to 1, vfio_start() hook is called, when the counter
> becomes 0 - vfio_stop(). Note that we may have multiple VFIO containers on
> the same PHB.
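The usage-counter scheme described above can be sketched roughly as follows. The struct is a simplified stand-in, not QEMU's MemoryRegion, and the comments mark where the patchset's vfio_start()/vfio_stop() hooks would fire:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for an IOMMU memory region's VFIO usage count. */
typedef struct {
    int vfio_users;    /* how many VFIO containers map this region */
    bool vfio_active;  /* whether the guest view lives in userspace */
} IOMMUUsage;

/* region_add() side: the 0 -> 1 transition moves the guest view of the
 * TCE table from KVM to userspace. */
static void iommu_vfio_ref(IOMMUUsage *u)
{
    if (u->vfio_users++ == 0) {
        u->vfio_active = true;   /* vfio_start() hook fires here */
    }
}

/* region_del() side: the 1 -> 0 transition lets spapr move the table
 * back into KVM. */
static void iommu_vfio_unref(IOMMUUsage *u)
{
    if (--u->vfio_users == 0) {
        u->vfio_active = false;  /* vfio_stop() hook fires here */
    }
}
```

Multiple containers on the same PHB just raise the count further; only the 0-to-1 and 1-to-0 transitions trigger the hooks.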
>
> Without 01/19 and 02/19, I'll have to repeat region_del()'s counter
> decrement steps in vfio_disconnect_container(). And I still cannot move
> counting from region_add() to vfio_connect_container() so there will be
> asymmetry which I am fine with, I am just checking here - what would be the
> best approach here?
You're imposing on other iommu models (type1) that in order to release
a container we first deregister the listener, which un-plays all of
the mappings within that region. That's inefficient when we can simply
unset the container and move on. So you're imposing an inefficiency on
a separate vfio iommu model for the bookkeeping of your own. I don't
think that's a reasonable approach. Has it even been tested how that
affects type1 users? When a container is closed, clearly it shouldn't
be contributing to reference counts, so it seems like there must be
other ways to handle this.
> >>
> >> Signed-off-by: Alexey Kardashevskiy <address@hidden>
> >> ---
> >> hw/vfio/common.c | 22 +++++++++++++++-------
> >> 1 file changed, 15 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> >> index fe5ec6a..0b40262 100644
> >> --- a/hw/vfio/common.c
> >> +++ b/hw/vfio/common.c
> >> @@ -921,23 +921,31 @@ static void vfio_disconnect_container(VFIOGroup *group)
> >>  {
> >>      VFIOContainer *container = group->container;
> >>
> >> -    if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER, &container->fd)) {
> >> -        error_report("vfio: error disconnecting group %d from container",
> >> -                     group->groupid);
> >> -    }
> >> -
> >>      QLIST_REMOVE(group, container_next);
> >> +
> >> +    if (QLIST_EMPTY(&container->group_list)) {
> >> +        VFIOGuestIOMMU *giommu;
> >> +
> >> +        vfio_listener_release(container);
> >> +
> >> +        QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
> >> +            memory_region_unregister_iommu_notifier(&giommu->n);
> >> +        }
> >> +    }
> >> +
> >>      group->container = NULL;
> >> +    if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER, &container->fd)) {
> >> +        error_report("vfio: error disconnecting group %d from container",
> >> +                     group->groupid);
> >> +    }
> >>
> >>      if (QLIST_EMPTY(&container->group_list)) {
> >>          VFIOAddressSpace *space = container->space;
> >>          VFIOGuestIOMMU *giommu, *tmp;
> >>
> >> -        vfio_listener_release(container);
> >>          QLIST_REMOVE(container, next);
> >>
> >>          QLIST_FOREACH_SAFE(giommu, &container->giommu_list, giommu_next, tmp) {
> >> -            memory_region_unregister_iommu_notifier(&giommu->n);
> >>              QLIST_REMOVE(giommu, giommu_next);
> >>              g_free(giommu);
> >>          }
> >
> > I'm not spotting why this is a 2-pass process vs simply moving the
> > existing QLIST_EMPTY cleanup above the ioctl. Thanks,