Re: [Qemu-devel] [RFC 1/2] pci-dma-api-v1


From: Blue Swirl
Subject: Re: [Qemu-devel] [RFC 1/2] pci-dma-api-v1
Date: Thu, 27 Nov 2008 21:14:45 +0200

On 11/27/08, Andrea Arcangeli <address@hidden> wrote:
> Hello everyone,
>
>  One major limitation for KVM today is the lack of a proper way to
>  write drivers so that the host OS can use direct DMA to the guest
>  physical memory and avoid any intermediate copy. The only API
>  provided to drivers seems to be cpu_physical_memory_rw, and that
>  forces all drivers to bounce, trash CPU caches and be memory
>  bound. This new DMA API instead lets drivers use a pci_dma_sg
>  method for SG I/O that translates the guest physical addresses to
>  host virtual addresses and then calls two operations: a submit
>  method and a complete method. pci_dma_sg may have to bounce
>  buffer internally, and to limit the max bounce size it may have
>  to submit the I/O in pieces with multiple submit calls. The patch
>  adapts the ide.c HD driver to use this. Once cdrom is converted
>  too, dma_buf_rw can be eliminated. As you can see, the new
>  ide_dma_submit and ide_dma_complete code is much more readable
>  than the previous rearming callback.
>
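
A minimal sketch of what such a submit/complete scatter-gather
interface could look like; every name and signature below is an
illustrative guess, not code taken from the patch:

    /* Illustrative only: a possible shape for a submit/complete SG API. */
    #include <sys/uio.h>     /* struct iovec */
    #include <stdint.h>
    #include <stddef.h>

    typedef struct DMASGEntry {
        uint64_t guest_paddr;          /* guest physical address */
        size_t   len;
    } DMASGEntry;

    /* Called for each piece the DMA layer managed to map (or bounce);
     * iov/iovcnt describe host virtual memory ready for readv/writev. */
    typedef int DMASubmitFunc(void *opaque, struct iovec *iov, int iovcnt);

    /* Called once, after the last piece has completed. */
    typedef void DMACompleteFunc(void *opaque, int ret);

    void pci_dma_sg(void *pci_dev, DMASGEntry *sg, int nsg, int is_write,
                    DMASubmitFunc *submit, DMACompleteFunc *complete,
                    void *opaque);

The submit/complete split is what allows bounded bouncing: each mapped
or bounced chunk is handed to the submit method, and the complete
method fires only once, after the final chunk.
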
>  This is only tested with KVM so far, but qemu builds; in general
>  there's nothing KVM-specific here (with the exception of a single
>  kvm_enabled check), so it should work well for both.
>
>  All we care about is the performance of the direct path, so I tried
>  to avoid dynamic allocations there to avoid entering glibc. The
>  current logic doesn't satisfy me yet, but it should at least be
>  faster than calling malloc (I'm still working on it, to avoid
>  memory waste and to detect when more than one iovec should be
>  cached). In case of instabilities, the first thing I recommend is
>  setting MAX_IOVEC_IOVCNT to 0 to disable that logic ;). I also
>  recommend testing with DEBUG_BOUNCE and with a 512-byte max bounce
>  buffer. It's running stable in all modes so far. However, if ide.c
>  ends up calling aio_cancel, things will likely fall apart, but this
>  is all because of bdrv_aio_readv/writev and the astonishing lack of
>  aio_readv/writev in glibc!
>
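
As an illustration of the kind of malloc avoidance being described
(hypothetical code, not taken from the patch; MAX_IOVEC_IOVCNT below
only mirrors the knob named above), a small pool of preallocated
iovec arrays keeps the common case out of glibc:

    /* Hypothetical sketch of an iovec cache for the DMA fast path. */
    #include <sys/uio.h>
    #include <stdlib.h>

    #define MAX_IOVEC_IOVCNT 512   /* iovecs per cached array */
    #define IOVEC_CACHE_SIZE 4     /* how many arrays are kept around */

    static struct iovec iovec_cache[IOVEC_CACHE_SIZE][MAX_IOVEC_IOVCNT];
    static int iovec_cache_used[IOVEC_CACHE_SIZE];

    /* Return a preallocated array when one fits and is free,
     * otherwise fall back to malloc (the slow path). */
    static struct iovec *iovec_get(int iovcnt)
    {
        int i;

        if (iovcnt <= MAX_IOVEC_IOVCNT) {
            for (i = 0; i < IOVEC_CACHE_SIZE; i++) {
                if (!iovec_cache_used[i]) {
                    iovec_cache_used[i] = 1;
                    return iovec_cache[i];
                }
            }
        }
        return malloc(iovcnt * sizeof(struct iovec));
    }

    /* Release an array back to the pool, or free it if it was malloc'd. */
    static void iovec_put(struct iovec *iov)
    {
        int i;

        for (i = 0; i < IOVEC_CACHE_SIZE; i++) {
            if (iov == iovec_cache[i]) {
                iovec_cache_used[i] = 0;
                return;
            }
        }
        free(iov);
    }
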
>  Once we finish fixing storage performance with a real
>  bdrv_aio_readv/writev (now a blocker issue), a pci_dma_single can
>  be added for zero-copy networking (one NIC per VM, or VMDq, IOV,
>  etc.). The DMA API should allow for that too.

The previous similar attempt by Anthony for generic DMA using vectored
IO was abandoned because the malloc/free overhead was more than the
performance gain. Have you made any performance measurements? How does
this version compare to the previous ones?

I think the pci_ prefix can be removed; there is little that is
PCI-specific.

For the Sparc32 IOMMU (and probably other IOMMUs), it should be
possible to register a function used in place of cpu_physical_memory_rw,
c_p_m_can_dma, etc. The goal is that it should be possible to stack the
DMA resolvers (think of devices behind a number of buses).
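
One possible shape for that kind of stacking, sketched only to
illustrate the idea (none of these names or types exist in qemu):

    /* Sketch: a per-bus translation hook that can be chained, so a
     * device behind an IOMMU behind a bridge resolves addresses in
     * stages before the root (guest RAM) is reached. */
    #include <stdint.h>
    #include <stddef.h>

    typedef uint64_t dma_addr_t;

    typedef struct DMAResolver DMAResolver;
    struct DMAResolver {
        /* Translate one bus address into an address valid on the parent
         * bus, clamping *len to the contiguous region; < 0 on fault. */
        int (*translate)(DMAResolver *r, dma_addr_t addr, dma_addr_t *out,
                         size_t *len, int is_write);
        DMAResolver *parent;   /* next hop; NULL at the root */
        void *opaque;          /* e.g. the IOMMU state */
    };

    /* Walk the chain; the final address is a guest physical address
     * that cpu_physical_memory_rw (or a map/unmap pair) can use. */
    static int dma_resolve(DMAResolver *r, dma_addr_t addr, dma_addr_t *out,
                           size_t *len, int is_write)
    {
        *out = addr;
        while (r) {
            if (r->translate(r, *out, out, len, is_write) < 0) {
                return -1;
            }
            r = r->parent;
        }
        return 0;
    }
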



