Re: [Qemu-devel] device assignment for embedded Power


From: Yoder Stuart-B08248
Subject: Re: [Qemu-devel] device assignment for embedded Power
Date: Tue, 5 Jul 2011 18:19:01 +0000


> -----Original Message-----
> From: Benjamin Herrenschmidt [mailto:address@hidden]
> Sent: Thursday, June 30, 2011 7:58 PM
> To: Yoder Stuart-B08248
> Cc: address@hidden; Wood Scott-B07421; Alexander Graf; address@hidden;
> address@hidden; address@hidden; address@hidden; address@hidden;
> address@hidden; address@hidden
> Subject: Re: device assignment for embedded Power
> 
> On Thu, 2011-06-30 at 15:59 +0000, Yoder Stuart-B08248 wrote:
> > One feature we need for QEMU/KVM on embedded Power Architecture is the
> > ability to do passthru assignment of SoC I/O devices and memory.  An
> > important use case in embedded is creating static partitions-- taking
> > physical memory and I/O devices (non-PCI) and partitioning
> > them between the host Linux and several virtual machines.   Things like
> > live migration would not be needed or supported in these types of scenarios.
> >
> > SoC devices do not sit on a probeable bus and there are no identifiers
> > like 01:00.0 with PCI that we can use to identify devices--  the host
> > Linux kernel is made aware of SoC I/O devices from nodes/properties in a
> > device tree structure passed at boot.   QEMU needs to generate a
> > device tree to pass to the guest as well with all the guest's virtual
> > and physical resources.  Today a number of mostly complete guest
> > device trees are kept under ./pc-bios in QEMU, but this is too static and
> > inflexible.
> >
> > Some new mechanism is needed to assign SoC devices to guests, and we
> > (FSL + Alex Graf) have been discussing a few possible approaches for
> > doing this from QEMU and would like some feedback.
> >
> > Some possibilities:
> >
> > 1. Option 1.  Pass the host dev tree to QEMU and assign devices
> >    by device tree path
> >
> >      -dtb ./mpc8572ds.dtb -device assigned-soc-dev,dev=/soc/address@hidden
> >
> >    /soc/address@hidden is the device tree path to the assigned device.
> >    The device node 'address@hidden' has some number of properties (e.g.
> >    address, interrupt info) and possibly subnodes under
> >    it.   QEMU copies that node when generating the guest dev tree.
> >    See snippet of entire node:  http://paste2.org/p/1496460
> 
> Yuck (see below)
> 
> > 2. Option 2.  Pass the entire assigned device node as a string to
> >    QEMU
> >
> >      -device assigned-soc-dev,dev=/address@hidden,dev-node='#address-cells = <1>;
> >       #size-cells = <0>; cell-index = <0>; compatible = "fsl-i2c";
> >       reg = <0xffe03000 0x100>; interrupts = <43 2>;
> >       interrupt-parent = <&mpic>; dfsrr;'
> 
> Beuark ! (see below)
> 
> >    This avoids needing to pass the host device tree, but could
> >    get awkward-- the i2c example above is very simple, some device
> >    nodes are very large with a complex hierarchy of subnodes and
> >    could be hundreds of lines of text to represent a single
> >    node.
> >
> > It gets more complicated...
> 
> 
> So, from a qemu command line perspective, all you should have to do is
> pass qemu the device-tree -path- to the device you want to pass through
> (you may support passing a full hierarchy here).
> 
> That is for normal MMIO-mapped SoC devices. Something else (individual
> i2c, usb, ...) will use specific virtualization of the corresponding
> busses.

Then why 'yuck' to option 1 :)?   That is basically what was being proposed.

> Anything else sucks too much really.
> 
> From there, well, there are several approaches inside qemu/kvm to handle
> that path. If you want to do things at the qemu level you can probably
> parse /proc/device-tree. But I'd personally just make it a kernel thing.
>
> I.e., I would have an ioctl to "instantiate" a pass-through device that
> takes that path as an argument. I would make it return an anonymous fd
> which you can then use to mmap the resources, etc.

Regarding implementation, I think there are 3 things that need
to be set up: 1) mmapping the device's registers, 2) getting the iommu
set up (if there is one), and 3) getting the interrupt(s) handled.
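
To make item 1 a bit more concrete, here is a minimal sketch (not QEMU
or kernel code) of how user space might turn a node's raw `reg` property,
as it would be read from /proc/device-tree, into a page-aligned range
suitable for mmap.  It assumes the parent node's #address-cells and
#size-cells are both 1, ignores any `ranges` translation, and reuses the
reg value from the i2c example quoted above.  The helper names decode_reg
and page_align are made up for illustration.

```python
import mmap
import struct


def decode_reg(raw, address_cells=1, size_cells=1):
    """Split a raw big-endian `reg` property into (address, size) pairs.

    The cell counts come from the *parent* node; 1 and 1 are assumed here.
    """
    cell_count = address_cells + size_cells
    entry_len = 4 * cell_count  # each cell is a 32-bit big-endian word
    if len(raw) % entry_len != 0:
        raise ValueError("reg length is not a multiple of the entry size")
    regions = []
    for off in range(0, len(raw), entry_len):
        cells = struct.unpack_from(">%dI" % cell_count, raw, off)
        addr = 0
        for cell in cells[:address_cells]:
            addr = (addr << 32) | cell
        size = 0
        for cell in cells[address_cells:]:
            size = (size << 32) | cell
        regions.append((addr, size))
    return regions


def page_align(addr, size, page_size=mmap.PAGESIZE):
    """Return (base, length, delta): a page-aligned window covering the
    region, plus the offset of the region inside that window."""
    base = addr & ~(page_size - 1)
    delta = addr - base
    length = (delta + size + page_size - 1) & ~(page_size - 1)
    return base, length, delta


# reg = <0xffe03000 0x100> from the i2c example, as raw property bytes.
raw = struct.pack(">II", 0xFFE03000, 0x100)
regions = decode_reg(raw)
windows = [page_align(addr, size, 4096) for addr, size in regions]
```

The resulting base and length are what would be handed to mmap on
whatever fd ends up exposing the device (an fd returned by the kernel,
as suggested above, or something else); that part is deliberately left
out since the interface is still under discussion.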

> > In some cases, modifications to device tree nodes may be needed.
> > An example-- sometimes a device tree property references another node
> > and that relationship may not exist when assigned to a guest.
> > A "phy-handle" property may need to be deleted and a "fixed-link"
> > property added to a node representing a network device.
> 
> That's fishy. Why wouldn't you give full access to the MDIO? It's shared?
> Such things are so device-specific that they would have to be handled by
> device-specific quirks, which can live either in qemu or in the kernel.

It is shared, and in this case we didn't want the phy shared.  That was a
super simple example to illustrate the idea.  From our experience with the
Freescale Embedded Hypervisor we see this as a definite requirement-- nodes
in the hardware device tree may need modifications.  In the P4080 device
tree there are some complex relationships expressed between nodes of our
'data path'.  In some cases the hardware device tree expresses configuration
information, and while it could be argued that config info does not belong
there, it's what some drivers expect and what we have right now.  So, a
mechanism to allow node updates is really needed.

> > So in addition to assigning a device, a mechanism is needed to update
> > device tree nodes.  So for the above example, maybe--
> >
> >  -device assigned-soc-dev,dev=/soc/address@hidden,delete-prop=phy-handle,
> >   node-update="fixed-link = <2 1 1000 0 0>"
> 
> That's just so gross and error prone, borderline insane.

Not going to argue the gross/insane part, but it's reality.  I don't
think anyone would type all that in at the command line; it would
live in an init script or something, so I don't see it being more error
prone than messing around with device trees in general.

There's a small set of operations needed, based on our experience:
   - adding and deleting properties (including phandle references)
   - adding and deleting nodes (including subtrees)
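
As a rough illustration of how small that set is, the operations can be
sketched against a toy in-memory tree.  This is only a sketch under
assumptions: the nested-dict layout, the helper names, and the node names
"/soc/enet" and "/soc/mdio" are all hypothetical, and it does not fix up
phandle references when nodes are removed.

```python
def new_node(props=None):
    """Create a node: a dict of properties plus a dict of child nodes."""
    return {"props": dict(props or {}), "children": {}}


def find_node(tree, path):
    """Walk a '/'-separated path from the root and return that node."""
    node = tree
    for name in filter(None, path.split("/")):
        node = node["children"][name]  # KeyError if the path does not exist
    return node


def set_prop(tree, path, name, value):
    """Add a property, or overwrite it if it already exists."""
    find_node(tree, path)["props"][name] = value


def delete_prop(tree, path, name):
    """Remove a property from the node at `path`."""
    del find_node(tree, path)["props"][name]


def add_node(tree, parent_path, name, node=None):
    """Attach a node (or a whole subtree) under `parent_path`."""
    parent = find_node(tree, parent_path)
    if name in parent["children"]:
        raise KeyError("node already exists: " + name)
    parent["children"][name] = node if node is not None else new_node()


def delete_node(tree, path):
    """Remove the node at `path` together with its subtree."""
    parent_path, _, name = path.rstrip("/").rpartition("/")
    del find_node(tree, parent_path)["children"][name]


# Hypothetical guest tree with a network device that references a phy.
tree = new_node()
add_node(tree, "/", "soc")
add_node(tree, "/soc", "enet", new_node({"phy-handle": 1}))
add_node(tree, "/soc", "mdio")

# The update from the example: drop phy-handle, add fixed-link.
delete_prop(tree, "/soc/enet", "phy-handle")
set_prop(tree, "/soc/enet", "fixed-link", [2, 1, 1000, 0, 0])
delete_node(tree, "/soc/mdio")
```

Whatever syntax is chosen on the QEMU side (command line options, a
config file, or something else) would only need to map onto these four
primitives.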

Stuart
