Re: [Qemu-devel] A question about PCI device address spaces


From: David Gibson
Subject: Re: [Qemu-devel] A question about PCI device address spaces
Date: Mon, 26 Dec 2016 22:40:15 +1100
User-agent: Mutt/1.7.1 (2016-10-04)

On Mon, Dec 26, 2016 at 01:01:34PM +0200, Marcel Apfelbaum wrote:
> On 12/22/2016 11:42 AM, Peter Xu wrote:
> > Hello,
> > 
> 
> Hi Peter,
> 
> > Since this is a general topic, I picked it out from the VT-d
> > discussion and put it here, just to make it clearer.
> > 
> > The issue is whether we have exposed too much address space to
> > emulated PCI devices.
> > 
> > Currently, each PCI device has PCIDevice::bus_master_as as its
> > device-visible address space, which is derived from
> > pci_device_iommu_address_space():
> > 
> > AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
> > {
> >     PCIBus *bus = PCI_BUS(dev->bus);
> >     PCIBus *iommu_bus = bus;
> > 
> >     while(iommu_bus && !iommu_bus->iommu_fn && iommu_bus->parent_dev) {
> >         iommu_bus = PCI_BUS(iommu_bus->parent_dev->bus);
> >     }
> >     if (iommu_bus && iommu_bus->iommu_fn) {
> >         return iommu_bus->iommu_fn(bus, iommu_bus->iommu_opaque, dev->devfn);
> >     }
> >     return &address_space_memory;
> > }
> > 
> > By default (the no-IOMMU case), it points to the system memory space,
> > which includes MMIO, and that looks wrong - a PCI device should not be
> > able to write to MMIO regions.
> > 
> 
> Why? As far as I know a PCI device can start a read/write transaction
> to virtually any address; it doesn't matter whether it 'lands' in RAM or in
> an MMIO region mapped by another device. But I might be wrong, I need to
> read the spec again...

So as I noted in another mail, my earlier comment (which led Peter to
say that) was misleading.  In particular I was talking about *non-PCI*
MMIO devices, which barely exist on x86 (and even there the statement
won't necessarily be true).

> The PCI transaction will eventually reach the Root Complex/PCI host bridge,
> where an IOMMU or some other hw entity can sanitize/translate it, but that
> is outside the scope of the device itself.

Right, but we're not talking about the device, or purely within PCI
address space.  We're explicitly talking about what addresses the
RC/host bridge will translate between PCI space and CPU address space.
I'm betting that even on x86, it won't be the whole 64-bit address
space (otherwise how would the host bridge know whether another PCI
device might be listening on that address).

> The Root Complex will 'translate' the transaction into a memory read/write
> on behalf of the device and pass it to the memory controller.
> If the transaction targets another device, I am not sure whether the
> Root Complex will re-route it by itself or pass it to the Memory Controller.

It will either re-route it itself, or simply drop it, possibly depending
on configuration.  I'm sure the MC won't be bouncing transactions back
to PCI space.  Note that for vanilla PCI the question is moot - the
cycle will be broadcast on the bus segment and something will pick it
up - either a device or the host bridge.  If multiple things try to
respond to the same addresses, things will go badly wrong.

> > As an example, if we dump a PCI device's address space in detail on an
> > x86_64 system, we can see (this is the address space for a virtio-net-pci
> > device on a Q35 machine with 6G of memory):
> > 
> >     0000000000000000-000000000009ffff (prio 0, RW): pc.ram
> >     00000000000a0000-00000000000affff (prio 1, RW): vga.vram
> >     00000000000b0000-00000000000bffff (prio 1, RW): vga-lowmem
> >     00000000000c0000-00000000000c9fff (prio 0, RW): pc.ram
> >     00000000000ca000-00000000000ccfff (prio 0, RW): pc.ram
> >     00000000000cd000-00000000000ebfff (prio 0, RW): pc.ram
> >     00000000000ec000-00000000000effff (prio 0, RW): pc.ram
> >     00000000000f0000-00000000000fffff (prio 0, RW): pc.ram
> >     0000000000100000-000000007fffffff (prio 0, RW): pc.ram
> >     00000000b0000000-00000000bfffffff (prio 0, RW): pcie-mmcfg-mmio
> >     00000000fd000000-00000000fdffffff (prio 1, RW): vga.vram
> >     00000000fe000000-00000000fe000fff (prio 0, RW): virtio-pci-common
> >     00000000fe001000-00000000fe001fff (prio 0, RW): virtio-pci-isr
> >     00000000fe002000-00000000fe002fff (prio 0, RW): virtio-pci-device
> >     00000000fe003000-00000000fe003fff (prio 0, RW): virtio-pci-notify
> >     00000000febd0400-00000000febd041f (prio 0, RW): vga ioports remapped
> >     00000000febd0500-00000000febd0515 (prio 0, RW): bochs dispi interface
> >     00000000febd0600-00000000febd0607 (prio 0, RW): qemu extended regs
> >     00000000febd1000-00000000febd102f (prio 0, RW): msix-table
> >     00000000febd1800-00000000febd1807 (prio 0, RW): msix-pba
> >     00000000febd2000-00000000febd2fff (prio 1, RW): ahci
> >     00000000fec00000-00000000fec00fff (prio 0, RW): kvm-ioapic
> >     00000000fed00000-00000000fed003ff (prio 0, RW): hpet
> >     00000000fed1c000-00000000fed1ffff (prio 1, RW): lpc-rcrb-mmio
> >     00000000fee00000-00000000feefffff (prio 4096, RW): kvm-apic-msi
> >     00000000fffc0000-00000000ffffffff (prio 0, R-): pc.bios
> >     0000000100000000-00000001ffffffff (prio 0, RW): pc.ram
> > 
> > So are the "pc.ram" regions here the only ones that we should expose
> > to PCI devices? (it should contain all of them, including the low-mem
> > ones and the >=4g one)
> > 
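As a side note, a per-AddressSpace dump like the one above can be obtained
from the QEMU HMP monitor, which prints every AddressSpace known to the
machine, including the per-device bus-master ones (the exact output format
varies between QEMU versions):

    (qemu) info mtree
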
> 
> As I previously said, it does not have to be RAM only, but let's also
> wait for Michael's opinion.
> 
> > And, should this rule work for all platforms?
> 
> The PCI rules should be generic for all platforms, but I don't know
> the other platforms.

The rules *within the PCI address space* will be common across
platforms.  But we're discussing the host bridge and the rules across
the PCI/host interface.  This behaviour - what address ranges will be
forwarded in which direction, for example - can and does vary
significantly by platform.
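
For illustration, the per-platform part plugs into exactly the lookup
Peter quoted above: the host bridge (or machine) code registers a
PCIIOMMUFunc with pci_setup_iommu(), and that callback decides which
AddressSpace a given device gets to see.  A minimal sketch, assuming a
hypothetical host bridge type (the my_phb_* names are made up, not real
QEMU code):

static AddressSpace *my_phb_dma_as(PCIBus *bus, void *opaque, int devfn)
{
    MyPHBState *phb = opaque;   /* hypothetical host bridge state */

    /* Could also key off devfn to hand out per-device address spaces. */
    return &phb->dma_as;        /* covers only the bridge's DMA window(s) */
}

static void my_phb_realize(DeviceState *dev, Error **errp)
{
    MyPHBState *phb = MY_PHB(dev);

    /* ... create phb->bus and phb->dma_as ... */

    /* After this, pci_device_iommu_address_space() returns &phb->dma_as
     * for devices under this bridge instead of &address_space_memory. */
    pci_setup_iommu(phb->bus, my_phb_dma_as, phb);
}

That is also why the behaviour legitimately differs per platform: each
host bridge model decides what it hands back here.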

> 
> Thanks,
> Marcel
> 
> > Or say, would it be a
> > problem if I directly changed address_space_memory in
> > pci_device_iommu_address_space() into something else which only
> > contains RAM? (of course this won't affect any platform that has an
> > IOMMU, i.e. a customized PCIBus::iommu_fn function)
> > 
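To make that concrete, one way to experiment with it (a sketch only - the
names are made up, it hard-codes the usual x86 RAM split and ignores memory
hotplug and other RAM regions) would be to build an address space out of
aliases to the RAM regions and return that instead of &address_space_memory:

static MemoryRegion pci_ram_root;
static AddressSpace pci_ram_as;

static void pci_init_ram_only_as(MemoryRegion *ram_below_4g,
                                 MemoryRegion *ram_above_4g)
{
    static MemoryRegion below_alias, above_alias;

    /* Empty container covering the whole 64-bit space. */
    memory_region_init(&pci_ram_root, NULL, "pci-dma-ram", UINT64_MAX);

    /* Map only the RAM regions, at their usual guest-physical offsets. */
    memory_region_init_alias(&below_alias, NULL, "pci-ram-below-4g",
                             ram_below_4g, 0,
                             memory_region_size(ram_below_4g));
    memory_region_add_subregion(&pci_ram_root, 0, &below_alias);

    memory_region_init_alias(&above_alias, NULL, "pci-ram-above-4g",
                             ram_above_4g, 0,
                             memory_region_size(ram_above_4g));
    memory_region_add_subregion(&pci_ram_root, 0x100000000ULL, &above_alias);

    address_space_init(&pci_ram_as, &pci_ram_root, "pci-dma-ram");
}

pci_device_iommu_address_space() could then return &pci_ram_as in the
default branch; whether anything legitimately relies on peer-to-peer MMIO
access through that path is exactly the open question here.
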
> > (btw, I'd appreciate it if anyone has a quick answer on why we have lots
> >  of contiguous "pc.ram" regions in the low 2G range - from can_merge() I
> >  guess they have different dirty_log_mask, romd_mode, etc., but I'd still
> >  like to know why they differ. Anyway, this is totally an "optional
> >  question", just to satisfy my own curiosity :)
> > 
> > Thanks,
> > 
> > -- peterx
> > 
> 

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
