From: David Gibson
Subject: Re: [Qemu-ppc] [PATCH 0/3 QEMU v4] vfio on ppc64
Date: Tue, 23 Apr 2013 09:44:17 +1000
User-agent: Mutt/1.5.21 (2010-09-15)

On Mon, Apr 22, 2013 at 12:39:26PM -0600, Alex Williamson wrote:
> On Mon, 2013-04-22 at 18:02 +1000, Alexey Kardashevskiy wrote:
> > Yes, we are still tuning this stuff for us :)
> 
> Yay!
> 
> > Changes:
> > 
> > * new "spapr-pci-vfio-host-bridge" device is introduced
> > (inherits from spapr-pci-host-bridge);
> > 
> > * device#1->group->container->device#2,.. creation scheme is changed to
> > group->container->devices scheme (on ppc64);
> > 
> > * removed vfio_iommu_spapr_tce_dma_map/vfio_iommu_spapr_tce_dma_unmap
> > as they are not really necessary now and it does not look like we will
> > need them in the near future.
> > 
> > Comments?
> 
> Not happy with this... :(

So, part of that will be my fault, not Alexey's, because of how I told
him to do this.

> > Alexey Kardashevskiy (3):
> >   vfio: make some of VFIO API public
> 
> Exports and conditionalizes vfio_get_group for no good reason other than
> to get a pointer to a group that's not connected to a container.  Not
> only is the bool parameter ugly, it has a poor interface - call it once
> with "false" and get an unconnected group, call it for the same group
> with "true", and still get back an unconnected group.  I'd say
> vfio_get_group should be split to have group and container connect
> separate, but I don't think you even need that...

The boolean is ugly, yes.  I was suggesting splitting a group_create()
function from get_group().

> >   vfio: add support for spapr platform (powerpc64 server)
> 
> ...because this patch ignores the extensibility of
> vfio_connect_container.  There is absolutely no reason for
> vfio_container_spapr_alloc to exist.  It duplicates most of
> vfio_connect_container, apparently using the attempt to reuse containers
> as an excuse, even though this should happily fall through unless the
> host code is broken.  vfio supports containers capable of multiple
> groups, it does not assume it or require it.

So, initially Alexey did extend vfio_connect_container(), but I think
this is wrong.  vfio_connect_container() does something different from
what we want on pseries, and not just because of details of the IOMMU
interface.

Apart from the construction of the container object (which could be
done by a shared helper), the guts of vfio_connect_container() does
two things:
        a) It finds a container for the group.  But it assumes any
container is suitable, as long as it will accept the group.  That works
for x86 because every container has the same mappings - all of RAM.
Even if we were able to extend the kernel interface to support multiple
groups per container, it would still be wrong for pseries: the PAPR
interface requires that each guest host bridge have independent DMA
mappings.  So the group *must* be bound to the container associated
with this device's guest host bridge; no other choice can be correct.

        b) It wires up the container to the MemoryListener, so that it
maps all RAM.  Our model doesn't do that.

Note that both these considerations would become relevant for x86,
too, if and when a guest-visible IOMMU interface is implemented: you
wouldn't want to wire up a memory listener, and the (guest) device
would have to be wired to the container associated with its guest
IOMMU domain, not any accepting container.
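
For reference, the guts of the current (x86/Type1) path boil down to
roughly this; a heavily condensed sketch, not the literal hw/misc/vfio.c
code, with error handling and the real listener callbacks left out and
member names simplified:

static int vfio_connect_container_sketch(VFIOGroup *group)
{
    VFIOContainer *container;

    /* (a) any existing container that accepts the group is considered
     *     suitable - fine when every container maps all of RAM anyway */
    QLIST_FOREACH(container, &container_list, next) {
        if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) {
            group->container = container;
            return 0;
        }
    }

    /* ...otherwise create a fresh container and bind the group to it */
    container = g_malloc0(sizeof(*container));
    container->fd = qemu_open("/dev/vfio/vfio", O_RDWR);
    ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd);
    ioctl(container->fd, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);

    /* (b) register a MemoryListener whose region_add/region_del call
     *     vfio_dma_map()/vfio_dma_unmap(), i.e. map all of guest RAM */
    container->listener = vfio_memory_listener;
    memory_listener_register(&container->listener, &address_space_memory);

    QLIST_INSERT_HEAD(&container_list, container, next);
    group->container = container;
    return 0;
}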

The fundamental point is we have two very different models of using
the IOMMU, the "map all" and "guest visible" models..  And while it
happens that the PAPR IOMMU has constraints which mean it can't be
used in map all mode, and we never currently use the x86 IOMMUs in
guest visible mode, there's nothing to prevent a future IOMMU being
usable in either mode.  In fact there's nothing preventing using Type1
in either mode right now, except for the exercise of coding the guest
IOMMU interface into qemu.  So the model of IOMMU usage should not be
treated as simply a detail of the IOMMU type.
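
As a rough illustration of where the vfio_dma_map()/vfio_dma_unmap()
calls come from in each model (sketch only; guest_phys_to_host() and
the TCE handling here are made up for the example):

/* "Map all" model: the MemoryListener drives the mappings, so all of
 * guest RAM is mapped into the container up front and device DMA
 * addresses are simply guest physical addresses. */
static void map_all_region_add(VFIOContainer *container, hwaddr gpa,
                               uint64_t size, void *host_vaddr)
{
    vfio_dma_map(container, gpa, size, host_vaddr, false /* read-write */);
}

/* "Guest visible" model: nothing is mapped up front.  Each guest
 * H_PUT_TCE hypercall updates one IOMMU page-table entry, and that
 * update is what drives map/unmap on the container bound to that
 * guest host bridge. */
static void guest_visible_put_tce(VFIOContainer *container, hwaddr ioba,
                                  uint64_t tce, uint64_t page_size)
{
    if (tce & 0x3) {    /* low bits are the PAPR read/write permissions */
        void *host_vaddr = guest_phys_to_host(tce & ~(page_size - 1));

        vfio_dma_map(container, ioba, page_size, host_vaddr, false);
    } else {
        vfio_dma_unmap(container, ioba, page_size);
    }
}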

Now, what I do wonder is whether we ought to have two different qemu
device types: one for a vfio device in map-all mode, which will find
and prepare its own container, and one in guest-visible mode, which
must be told which container to use and fails if it can't add itself
to that container.
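
Purely as an illustration of that second flavour (names are
hypothetical, this isn't proposed code):

/* Hypothetical init path for the "guest visible" flavour: the device is
 * handed the container of its guest host bridge and must join exactly
 * that container, with no fallback to finding some other one. */
static int vfio_guest_visible_initfn(PCIDevice *pdev, VFIOGroup *group,
                                     VFIOContainer *phb_container)
{
    if (!phb_container) {
        error_report("vfio: no container for this guest host bridge");
        return -1;
    }
    if (vfio_group_attach_container(group, phb_container) < 0) {
        error_report("vfio: group cannot join its host bridge's container");
        return -1;
    }
    return 0;
}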

> >   spapr vfio: add spapr-pci-vfio-host-bridge to support vfio
> 
> And obviously this is the consumer of all that, which really just wants
> some parameters from the iommu info ioctl and a way to call map and
> unmap for dma.  What if instead of all of the above, we simply:
> 
> 1) export vfio_dma_map/vfio_dma_unmap
> 2) add spapr support to vfio_connect_container (skip the info and enable
> chunks)
> 3) add to vfio.c and export:
> 
> VFIOContainer *vfio_spapr_phb_init(int groupid,
>                                    struct vfio_iommu_spapr_tce_info *info)
> {
>     VFIOGroup *group = vfio_get_group(groupid); /* including connect */
> 
>     ioctl(group->container->fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, info);
>     ioctl(group->container->fd, VFIO_IOMMU_ENABLE);
>     return group->container;
> }
> 
> Isn't that all you really need?  Thanks,
> 
> Alex
> 
> >  hw/misc/vfio.c              |   94 ++++++++++++++++++++--
> >  hw/ppc/spapr_iommu.c        |  151 +++++++++++++++++++++++++++--------
> >  hw/ppc/spapr_pci.c          |  186 +++++++++++++++++++++++++++++++++++++++++--
> >  include/hw/misc/vfio.h      |   20 +++++
> >  include/hw/pci-host/spapr.h |   12 +++
> >  include/hw/ppc/spapr.h      |    3 +
> >  trace-events                |    4 +
> >  7 files changed, 422 insertions(+), 48 deletions(-)
> >  create mode 100644 include/hw/misc/vfio.h

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
