Re: [Qemu-devel] [PATCH 0/7] virtio: virtio-blk data plane


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH 0/7] virtio: virtio-blk data plane
Date: Tue, 20 Nov 2012 13:21:41 +0100

On Tue, Nov 20, 2012 at 10:02 AM, Asias He <address@hidden> wrote:
> Hello Stefan,
>
> On 11/15/2012 11:18 PM, Stefan Hajnoczi wrote:
>> This series adds the -device virtio-blk-pci,x-data-plane=on property that
>> enables a high performance I/O codepath.  A dedicated thread is used to
>> process virtio-blk requests outside the global mutex and without going
>> through the QEMU block layer.
>>
>> Khoa Huynh <address@hidden> reported an increase from 140,000 IOPS to 600,000
>> IOPS for a single VM using virtio-blk-data-plane in July:
>>
>>   http://comments.gmane.org/gmane.comp.emulators.kvm.devel/94580
>>
>> The virtio-blk-data-plane approach was originally presented at Linux Plumbers
>> Conference 2010.  The following slides contain a brief overview:
>>
>>   http://linuxplumbersconf.org/2010/ocw/system/presentations/651/original/Optimizing_the_QEMU_Storage_Stack.pdf
>>
>> The basic approach is:
>> 1. Each virtio-blk device has a thread dedicated to handling ioeventfd
>>    signalling when the guest kicks the virtqueue.
>> 2. Requests are processed without going through the QEMU block layer using
>>    Linux AIO directly.
>> 3. Completion interrupts are injected via irqfd from the dedicated thread.
>>
>> To try it out:
>>
>>   qemu -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=...
>>        -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on
>
>
> Are these the latest dataplane bits:
> (git://github.com/stefanha/qemu.git virtio-blk-data-plane)
>
> commit 7872075c24fa01c925d4f41faa9d04ce69bf5328
> Author: Stefan Hajnoczi <address@hidden>
> Date:   Wed Nov 14 15:45:38 2012 +0100
>
>     virtio-blk: add x-data-plane=on|off performance feature
>
>
> With this commit on a ramdisk-based box, I am seeing about 10K IOPS with
> x-data-plane on and 90K IOPS with x-data-plane off.
>
> Any ideas?
>
> Command line I used:
>
> IMG=/dev/ram0
> x86_64-softmmu/qemu-system-x86_64 \
> -drive file=/root/img/sid.img,if=ide \
> -drive file=${IMG},if=none,cache=none,aio=native,id=disk1 \
> -device virtio-blk-pci,x-data-plane=off,drive=disk1,scsi=off \
> -kernel $KERNEL -append "root=/dev/sdb1 console=tty0" \
> -L /tmp/qemu-dataplane/share/qemu/ -nographic -vnc :0 -enable-kvm \
> -m 2048 -smp 4 -cpu qemu64,+x2apic -M pc
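
In outline, the per-device thread described in the quoted cover letter,
waiting on the ioeventfd kick, submitting requests straight to Linux AIO,
and signalling completion through the irqfd, does roughly the following.
This is only an illustrative sketch: the data_plane struct and the vring
helpers below are placeholders, not code from the series.

/* Illustrative sketch of the per-device data-plane thread.  The fds come
 * from KVM_IOEVENTFD / KVM_IRQFD registrations; vring_pop_request() and
 * vring_push_completions() stand in for the real vring accessors. */
#include <libaio.h>
#include <stdint.h>
#include <unistd.h>

struct data_plane {
    int kick_fd;          /* ioeventfd: KVM signals it on a virtqueue kick */
    int irq_fd;           /* irqfd: writing 1 makes KVM inject the vq interrupt */
    int disk_fd;          /* raw image/device opened with O_DIRECT */
    io_context_t aio_ctx; /* Linux AIO context created with io_setup() */
};

/* Hypothetical helpers standing in for the real vring_pop()/vring_push(). */
extern int  vring_pop_request(struct iocb *iocb, int disk_fd);
extern void vring_push_completions(struct io_event *events, int n);

static void *data_plane_thread(void *opaque)
{
    struct data_plane *dp = opaque;
    struct iocb iocbs[128];
    struct iocb *ptrs[128];
    struct io_event events[128];

    for (;;) {
        uint64_t cnt;

        /* 1. Block until the guest kicks the virtqueue (ioeventfd). */
        if (read(dp->kick_fd, &cnt, sizeof(cnt)) != sizeof(cnt)) {
            continue;
        }

        /* 2. Pop requests off the vring and hand them straight to Linux AIO,
         *    bypassing the QEMU block layer. */
        int n = 0;
        while (n < 128 && vring_pop_request(&iocbs[n], dp->disk_fd)) {
            ptrs[n] = &iocbs[n];
            n++;
        }
        if (n > 0) {
            io_submit(dp->aio_ctx, n, ptrs);
        }

        /* 3. Reap whatever has completed and inject the completion interrupt
         *    from this thread by writing to the irqfd.  (The real code also
         *    waits on an AIO completion eventfd rather than polling here.) */
        int done = io_getevents(dp->aio_ctx, 0, 128, events, NULL);
        if (done > 0) {
            uint64_t one = 1;
            vring_push_completions(events, done);
            write(dp->irq_fd, &one, sizeof(one));
        }
    }
    return NULL;
}

The ioeventfd and irqfd are registered with KVM via the KVM_IOEVENTFD and
KVM_IRQFD ioctls, so neither the kick path nor the completion path needs
the QEMU global mutex.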

I was just about to send out the latest patch series, which addresses the
review comments, so I have tested the latest code
(61b70fef489ce51ecd18d69afb9622c110b9315c).

I was unable to reproduce the ramdisk performance regression on Linux
3.6.6-3.fc18.x86_64 with an Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz and
8 GB RAM.

The ramdisk is 4 GB and I used your QEMU command-line with a RHEL 6.3 guest.

Summary results:
x-data-plane-on: iops=132856 aggrb=1039.1MB/s
x-data-plane-off: iops=126236 aggrb=988.40MB/s

virtio-blk-data-plane is ~5% faster in this benchmark.
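(132856 / 126236 = 1.052, i.e. roughly a 5% improvement in IOPS.)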

fio jobfile:
[global]
filename=/dev/vda
blocksize=8k
ioengine=libaio
direct=1
iodepth=8
runtime=120
time_based=1

[reads]
readwrite=randread
numjobs=4
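
The jobfile targets /dev/vda inside the guest; assuming it is saved there
as, say, randread.fio (the filename is just an example), it is run with:

  fio randread.fio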

Perf top (data-plane-on):
  3.71%  [kvm]               [k] kvm_arch_vcpu_ioctl_run
  3.27%  [kernel]            [k] memset    <--- ramdisk
  2.98%  [kernel]            [k] do_blockdev_direct_IO
  2.82%  [kvm_intel]         [k] vmx_vcpu_run
  2.66%  [kernel]            [k] _raw_spin_lock_irqsave
  2.06%  [kernel]            [k] put_compound_page
  2.06%  [kernel]            [k] __get_page_tail
  1.83%  [i915]              [k] __gen6_gt_force_wake_mt_get
  1.75%  [kernel]            [k] _raw_spin_unlock_irqrestore
  1.33%  qemu-system-x86_64  [.] vring_pop <--- virtio-blk-data-plane
  1.19%  [kernel]            [k] compound_unlock_irqrestore
  1.13%  [kernel]            [k] gup_huge_pmd
  1.11%  [kernel]            [k] __audit_syscall_exit
  1.07%  [kernel]            [k] put_page_testzero
  1.01%  [kernel]            [k] fget
  1.01%  [kernel]            [k] do_io_submit

Since the ramdisk (memset and page-related functions) is so prominent in
perf top, I also tried a 1-job 8k dd sequential write test on a Samsung
830 Series SSD, where virtio-blk-data-plane was 9% faster than
virtio-blk.  Optimizing against a ramdisk isn't a good idea IMO because
it behaves very differently from real hardware, where the driver relies
on MMIO, DMA, and interrupts (rather than synchronous memcpy/memset).
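
The dd test was along these lines; the device node and count below are
illustrative, not the exact invocation used:

  dd if=/dev/zero of=/dev/vdb bs=8k count=1000000 oflag=direct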

Full results:
$ cat data-plane-off
reads: (g=0): rw=randread, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=8
...
reads: (g=0): rw=randread, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=8
fio 1.57
Starting 4 processes

reads: (groupid=0, jobs=1): err= 0: pid=1851
  read : io=29408MB, bw=250945KB/s, iops=31368 , runt=120001msec
    slat (usec): min=2 , max=27829 , avg=11.06, stdev=78.05
    clat (usec): min=1 , max=28028 , avg=241.41, stdev=388.47
     lat (usec): min=33 , max=28035 , avg=253.17, stdev=396.66
    bw (KB/s) : min=197141, max=335365, per=24.78%, avg=250797.02, stdev=29376.35
  cpu          : usr=6.55%, sys=31.34%, ctx=310932, majf=0, minf=41
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=3764202/0/0, short=0/0/0
     lat (usec): 2=0.01%, 4=0.01%, 20=0.01%, 50=1.78%, 100=27.11%
     lat (usec): 250=38.97%, 500=27.11%, 750=2.09%, 1000=0.71%
     lat (msec): 2=1.32%, 4=0.70%, 10=0.20%, 20=0.01%, 50=0.01%
reads: (groupid=0, jobs=1): err= 0: pid=1852
  read : io=29742MB, bw=253798KB/s, iops=31724 , runt=120001msec
    slat (usec): min=2 , max=17007 , avg=10.61, stdev=67.51
    clat (usec): min=1 , max=41531 , avg=239.00, stdev=379.03
     lat (usec): min=32 , max=41547 , avg=250.33, stdev=385.21
    bw (KB/s) : min=194336, max=347497, per=25.02%, avg=253204.25, stdev=31172.37
  cpu          : usr=6.66%, sys=32.58%, ctx=327250, majf=0, minf=41
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=3806999/0/0, short=0/0/0
     lat (usec): 2=0.01%, 20=0.01%, 50=1.54%, 100=26.45%, 250=40.04%
     lat (usec): 500=27.15%, 750=1.95%, 1000=0.71%
     lat (msec): 2=1.29%, 4=0.68%, 10=0.18%, 20=0.01%, 50=0.01%
reads: (groupid=0, jobs=1): err= 0: pid=1853
  read : io=29859MB, bw=254797KB/s, iops=31849 , runt=120001msec
    slat (usec): min=2 , max=16821 , avg=11.35, stdev=76.54
    clat (usec): min=1 , max=17659 , avg=237.25, stdev=375.31
     lat (usec): min=31 , max=17673 , avg=249.27, stdev=383.62
    bw (KB/s) : min=194864, max=345280, per=25.15%, avg=254534.63, stdev=30549.32
  cpu          : usr=6.52%, sys=31.84%, ctx=303763, majf=0, minf=39
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=3821989/0/0, short=0/0/0
     lat (usec): 2=0.01%, 10=0.01%, 20=0.01%, 50=2.09%, 100=29.19%
     lat (usec): 250=37.31%, 500=26.41%, 750=2.08%, 1000=0.71%
     lat (msec): 2=1.32%, 4=0.70%, 10=0.20%, 20=0.01%
reads: (groupid=0, jobs=1): err= 0: pid=1854
  read : io=29598MB, bw=252565KB/s, iops=31570 , runt=120001msec
    slat (usec): min=2 , max=26413 , avg=11.21, stdev=78.32
    clat (usec): min=16 , max=27993 , avg=239.56, stdev=381.67
     lat (usec): min=34 , max=28006 , avg=251.49, stdev=390.13
    bw (KB/s) : min=194256, max=369424, per=24.94%, avg=252462.86, stdev=29420.58
  cpu          : usr=6.57%, sys=31.33%, ctx=305623, majf=0, minf=41
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=3788507/0/0, short=0/0/0
     lat (usec): 20=0.01%, 50=2.13%, 100=28.30%, 250=37.74%, 500=26.66%
     lat (usec): 750=2.17%, 1000=0.75%
     lat (msec): 2=1.35%, 4=0.70%, 10=0.19%, 20=0.01%, 50=0.01%

Run status group 0 (all jobs):
   READ: io=118607MB, aggrb=988.40MB/s, minb=256967KB/s, maxb=260912KB/s, mint=120001msec, maxt=120001msec

Disk stats (read/write):
  vda: ios=15148328/0, merge=0/0, ticks=1550570/0, in_queue=1536232, util=96.56%

$ cat data-plane-on
reads: (g=0): rw=randread, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=8
...
reads: (g=0): rw=randread, bs=8K-8K/8K-8K, ioengine=libaio, iodepth=8
fio 1.57
Starting 4 processes

reads: (groupid=0, jobs=1): err= 0: pid=1796
  read : io=32081MB, bw=273759KB/s, iops=34219 , runt=120001msec
    slat (usec): min=1 , max=20404 , avg=21.08, stdev=125.49
    clat (usec): min=10 , max=135743 , avg=207.62, stdev=532.90
     lat (usec): min=21 , max=136055 , avg=229.60, stdev=556.82
    bw (KB/s) : min=56480, max=951952, per=25.49%, avg=271488.81, stdev=149773.57
  cpu          : usr=7.01%, sys=43.26%, ctx=336854, majf=0, minf=41
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=4106413/0/0, short=0/0/0
     lat (usec): 20=0.01%, 50=2.46%, 100=61.13%, 250=21.58%, 500=3.11%
     lat (usec): 750=3.04%, 1000=3.88%
     lat (msec): 2=4.50%, 4=0.13%, 10=0.11%, 20=0.06%, 50=0.01%
     lat (msec): 250=0.01%
reads: (groupid=0, jobs=1): err= 0: pid=1797
  read : io=30104MB, bw=256888KB/s, iops=32110 , runt=120001msec
    slat (usec): min=1 , max=17595 , avg=22.20, stdev=120.29
    clat (usec): min=13 , max=136264 , avg=221.21, stdev=528.19
     lat (usec): min=22 , max=136280 , avg=244.35, stdev=551.73
    bw (KB/s) : min=57312, max=838880, per=23.93%, avg=254798.51, stdev=139546.57
  cpu          : usr=6.82%, sys=41.87%, ctx=360348, majf=0, minf=41
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=3853351/0/0, short=0/0/0
     lat (usec): 20=0.01%, 50=2.10%, 100=58.47%, 250=22.38%, 500=3.68%
     lat (usec): 750=3.69%, 1000=4.52%
     lat (msec): 2=4.87%, 4=0.14%, 10=0.11%, 20=0.05%, 250=0.01%
reads: (groupid=0, jobs=1): err= 0: pid=1798
  read : io=31698MB, bw=270487KB/s, iops=33810 , runt=120001msec
    slat (usec): min=1 , max=17457 , avg=20.93, stdev=125.33
    clat (usec): min=16 , max=134663 , avg=210.19, stdev=535.77
     lat (usec): min=21 , max=134671 , avg=232.02, stdev=559.27
    bw (KB/s) : min=57248, max=841952, per=25.29%, avg=269330.21, stdev=148661.08
  cpu          : usr=6.92%, sys=42.81%, ctx=337799, majf=0, minf=39
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=4057340/0/0, short=0/0/0
     lat (usec): 20=0.01%, 50=1.98%, 100=62.00%, 250=20.70%, 500=3.22%
     lat (usec): 750=3.23%, 1000=4.16%
     lat (msec): 2=4.41%, 4=0.13%, 10=0.10%, 20=0.06%, 250=0.01%
reads: (groupid=0, jobs=1): err= 0: pid=1799
  read : io=30913MB, bw=263789KB/s, iops=32973 , runt=120000msec
    slat (usec): min=1 , max=17565 , avg=21.52, stdev=120.17
    clat (usec): min=15 , max=136064 , avg=215.53, stdev=529.56
     lat (usec): min=27 , max=136070 , avg=237.99, stdev=552.50
    bw (KB/s) : min=57632, max=900896, per=24.74%, avg=263431.57, stdev=148379.15
  cpu          : usr=6.90%, sys=42.56%, ctx=348217, majf=0, minf=41
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=3956830/0/0, short=0/0/0
     lat (usec): 20=0.01%, 50=1.76%, 100=59.96%, 250=22.21%, 500=3.45%
     lat (usec): 750=3.35%, 1000=4.33%
     lat (msec): 2=4.65%, 4=0.13%, 10=0.11%, 20=0.05%, 250=0.01%

Run status group 0 (all jobs):
   READ: io=124796MB, aggrb=1039.1MB/s, minb=263053KB/s, maxb=280328KB/s, mint=120000msec, maxt=120001msec

Disk stats (read/write):
  vda: ios=15942789/0, merge=0/0, ticks=336240/0, in_queue=317832, util=97.47%


