qemu-devel

Re: [Qemu-devel] [PATCH v4 00/11] virtio: virtio-blk data plane


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH v4 00/11] virtio: virtio-blk data plane
Date: Thu, 29 Nov 2012 10:18:59 +0100

On Thu, Nov 22, 2012 at 4:16 PM, Stefan Hajnoczi <address@hidden> wrote:
> This series adds the -device virtio-blk-pci,x-data-plane=on property that
> enables a high performance I/O codepath.  A dedicated thread is used to process
> virtio-blk requests outside the global mutex and without going through the QEMU
> block layer.
>
> Khoa Huynh <address@hidden> reported an increase from 140,000 IOPS to 600,000
> IOPS for a single VM using virtio-blk-data-plane in July:
>
>   http://comments.gmane.org/gmane.comp.emulators.kvm.devel/94580
>
> The virtio-blk-data-plane approach was originally presented at Linux Plumbers
> Conference 2010.  The following slides contain a brief overview:
>
>   http://linuxplumbersconf.org/2010/ocw/system/presentations/651/original/Optimizing_the_QEMU_Storage_Stack.pdf
>
> The basic approach is:
> 1. Each virtio-blk device has a thread dedicated to handling ioeventfd
>    signalling when the guest kicks the virtqueue.
> 2. Requests are processed without going through the QEMU block layer using
>    Linux AIO directly.
> 3. Completion interrupts are injected via irqfd from the dedicated thread.
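
To illustrate the notification flow in steps 1 and 3, here is a minimal
userspace sketch that uses two plain eventfd(2) descriptors as stand-ins for
the ioeventfd and the irqfd.  It does not involve KVM, a vring, or any code
from this series; the helpers notify(), drain() and data_plane_run_once() are
hypothetical names made up for the example.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: signal an eventfd once (a guest kick or a completion). */
static int notify(int fd)
{
    uint64_t one = 1;
    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Hypothetical helper: read and reset the eventfd counter, -1 on error. */
static int64_t drain(int fd)
{
    uint64_t count;
    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return -1;
    }
    return (int64_t)count;
}

/*
 * One iteration of a data-plane-style loop (illustrative only):
 * wait for a kick, consume it, then signal completion on irq_fd.
 * Returns the number of kicks consumed, 0 on timeout, -1 on error.
 */
static int64_t data_plane_run_once(int kick_fd, int irq_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = kick_fd, .events = POLLIN };
    int ret;
    int64_t kicks;

    assert(kick_fd >= 0 && irq_fd >= 0);

    ret = poll(&pfd, 1, timeout_ms);
    if (ret <= 0) {
        return ret;
    }

    kicks = drain(kick_fd);
    if (kicks < 0) {
        return -1;
    }

    /* A real device would pop requests from the vring and submit I/O here. */

    if (notify(irq_fd) < 0) {
        return -1;
    }
    return kicks;
}
```

Because an eventfd in its default mode accumulates writes into one counter,
several kicks that arrive before the thread wakes up are consumed in a single
wakeup, and the sketch raises one completion notification for the batch.
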
>
> To try it out:
>
>   qemu -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=...
>        -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on
>
> Limitations:
>  * Only format=raw is supported
>  * Live migration is not supported
>  * Block jobs, hot unplug, and other operations fail with -EBUSY
>  * I/O throttling limits are ignored
>  * Only Linux hosts are supported due to Linux AIO usage
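
For readers unfamiliar with the Linux AIO interface behind the last
limitation, the following is a rough sketch of a single asynchronous read
issued through the raw io_setup/io_submit/io_getevents system calls.  It is
not taken from hw/dataplane/ioq.c; the sys_* wrappers and aio_pread_once()
are hypothetical names for this example, and it creates one context per call
only to stay short.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Thin wrappers: glibc does not export these system calls directly. */
static long sys_io_setup(unsigned nr_events, aio_context_t *ctx)
{
    return syscall(SYS_io_setup, nr_events, ctx);
}

static long sys_io_submit(aio_context_t ctx, long nr, struct iocb **iocbs)
{
    return syscall(SYS_io_submit, ctx, nr, iocbs);
}

static long sys_io_getevents(aio_context_t ctx, long min_nr, long nr,
                             struct io_event *events, struct timespec *timeout)
{
    return syscall(SYS_io_getevents, ctx, min_nr, nr, events, timeout);
}

static long sys_io_destroy(aio_context_t ctx)
{
    return syscall(SYS_io_destroy, ctx);
}

/*
 * Hypothetical helper: read len bytes at offset using one Linux AIO request
 * and wait for its completion.  Returns the byte count from the completion
 * event, or -1 if setup, submission or reaping failed.
 */
static long aio_pread_once(int fd, void *buf, size_t len, long long offset)
{
    aio_context_t ctx = 0;
    struct iocb cb;
    struct iocb *list[1] = { &cb };
    struct io_event ev;
    long res = -1;

    assert(buf != NULL);

    if (sys_io_setup(1, &ctx) < 0) {
        return -1;
    }

    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_lio_opcode = IOCB_CMD_PREAD;
    cb.aio_buf = (uint64_t)(uintptr_t)buf;
    cb.aio_nbytes = len;
    cb.aio_offset = offset;

    /* Submit one request, then block until its completion event arrives. */
    if (sys_io_submit(ctx, 1, list) == 1 &&
        sys_io_getevents(ctx, 1, 1, &ev, NULL) == 1) {
        res = (long)ev.res;
    }

    sys_io_destroy(ctx);
    return res;
}
```

A long-lived device would keep one context open and batch many iocbs per
io_submit call; with cache=none the file is opened with O_DIRECT, which adds
buffer and offset alignment requirements that this sketch does not address.
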
>
> The code has reached a stage where I feel it is ready to merge.  Users have
> been playing with it for some time and want the significant performance boost.
>
> We are refactoring QEMU to get rid of the global mutex.  I believe that
> virtio-blk-data-plane can eventually become the default mode of operation.
>
> Instead of waiting for global mutex removal efforts to finish, I want to use
> virtio-blk-data-plane as an example device for AioContext and threaded hw
> dispatch refactoring.  This means:
>
> 1. When the block layer can bind to an AioContext and execute I/O outside the
>    global mutex, virtio-blk-data-plane can use this (and gain image format
>    support).
>
> 2. When hw dispatch no longer needs the global mutex we can use hw/virtio.c
>    again and perhaps run a pool of iothreads instead of dedicated data plane
>    threads.
>
> But in the meantime, I have cleaned up the virtio-blk-data-plane code so that
> it can be merged as an experimental feature.
>
> v4:
>  * Add qemu_iovec_concat_iov() [Paolo]
>  * Use QEMUIOVector to copy out virtio_blk_inhdr [Michael, Paolo]
>
> v3:
>  * Don't assume iovec layout [Michael]
>  * Better naming for hostmem.c MemoryListener callbacks [Don]
>  * More vring quarantining if commands are bogus instead of exiting [Blue]
>
> v2:
>  * Use MemoryListener for thread-safe memory mapping [Paolo, Anthony, and
>    everyone else pointed this out ;-)]
>  * Quarantine invalid vring instead of exiting [Blue]
>  * Replace __u16 kernel types with uint16_t [Blue]
>
> Changes from the RFC v9:
>  * Add x-data-plane=on|off option and coexist with regular virtio-blk code
>  * Create thread from BH so it inherits iothread cpusets
>  * Drain requests on vm_stop() so stopped guest does not access image file
>  * Add migration blocker
>  * Add bdrv_in_use() to prevent block jobs and other operations that can
>    interfere
>  * Drop IOQueue request merging for simplicity
>  * Drop ioctl interrupt injection and always use irqfd for simplicity
>  * Major cleanup to split up source files
>  * Rebase from qemu-kvm.git onto qemu.git
>  * Address Michael Tsirkin's review comments
>
> Stefan Hajnoczi (11):
>   raw-posix: add raw_get_aio_fd() for virtio-blk-data-plane
>   configure: add CONFIG_VIRTIO_BLK_DATA_PLANE
>   dataplane: add host memory mapping code
>   dataplane: add virtqueue vring code
>   dataplane: add event loop
>   dataplane: add Linux AIO request queue
>   iov: add iov_discard() to remove data
>   test-iov: add iov_discard() testcase
>   iov: add qemu_iovec_concat_iov()
>   dataplane: add virtio-blk data plane code
>   virtio-blk: add x-data-plane=on|off performance feature
>
>  block.h                    |   9 +
>  block/raw-posix.c          |  34 ++++
>  configure                  |  21 +++
>  hw/Makefile.objs           |   2 +-
>  hw/dataplane/Makefile.objs |   3 +
>  hw/dataplane/event-poll.c  | 109 ++++++++++++
>  hw/dataplane/event-poll.h  |  40 +++++
>  hw/dataplane/hostmem.c     | 165 ++++++++++++++++++
>  hw/dataplane/hostmem.h     |  52 ++++++
>  hw/dataplane/ioq.c         | 118 +++++++++++++
>  hw/dataplane/ioq.h         |  57 ++++++
>  hw/dataplane/virtio-blk.c  | 427 +++++++++++++++++++++++++++++++++++++++++++++
>  hw/dataplane/virtio-blk.h  |  41 +++++
>  hw/dataplane/vring.c       | 344 ++++++++++++++++++++++++++++++++++++
>  hw/dataplane/vring.h       |  62 +++++++
>  hw/virtio-blk.c            |  59 ++++++-
>  hw/virtio-blk.h            |   1 +
>  hw/virtio-pci.c            |   3 +
>  iov.c                      |  80 +++++++--
>  iov.h                      |  13 ++
>  qemu-common.h              |   3 +
>  tests/test-iov.c           | 129 ++++++++++++++
>  trace-events               |   9 +
>  23 files changed, 1767 insertions(+), 14 deletions(-)
>  create mode 100644 hw/dataplane/Makefile.objs
>  create mode 100644 hw/dataplane/event-poll.c
>  create mode 100644 hw/dataplane/event-poll.h
>  create mode 100644 hw/dataplane/hostmem.c
>  create mode 100644 hw/dataplane/hostmem.h
>  create mode 100644 hw/dataplane/ioq.c
>  create mode 100644 hw/dataplane/ioq.h
>  create mode 100644 hw/dataplane/virtio-blk.c
>  create mode 100644 hw/dataplane/virtio-blk.h
>  create mode 100644 hw/dataplane/vring.c
>  create mode 100644 hw/dataplane/vring.h

Michael, Paolo: Are you happy with v4?

Kevin: Do you want to take this series through the block tree?

Thanks,
Stefan


