qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] virtio-blk performance regression and qemu-kvm


From: Dongsu Park
Subject: Re: [Qemu-devel] virtio-blk performance regression and qemu-kvm
Date: Tue, 21 Feb 2012 16:57:25 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

Hi Stefan,
see below.

On 13.02.2012 11:57, Stefan Hajnoczi wrote:
> On Fri, Feb 10, 2012 at 2:36 PM, Dongsu Park
> <address@hidden> wrote:
> >  Now I'm running benchmarks with both qemu-kvm 0.14.1 and 1.0.
> >
> >  - Sequential read (Running inside guest)
> >   # fio -name iops -rw=read -size=1G -iodepth 1 \
> >    -filename /dev/vdb -ioengine libaio -direct=1 -bs=4096
> >
> >  - Sequential write (Running inside guest)
> >   # fio -name iops -rw=write -size=1G -iodepth 1 \
> >    -filename /dev/vdb -ioengine libaio -direct=1 -bs=4096
> >
> >  For each one, I tested 3 times to get the average.
> >
> >  Result:
> >
> >  seqread with qemu-kvm 0.14.1   67,0 MByte/s
> >  seqread with qemu-kvm 1.0      30,9 MByte/s
> >
> >  seqwrite with qemu-kvm 0.14.1  65,8 MByte/s
> >  seqwrite with qemu-kvm 1.0     30,5 MByte/s
> 
> Please retry with the following commit or simply qemu-kvm.git/master.
> Avi discovered a performance regression which was introduced when the
> block layer was converted to use coroutines:
> 
> $ git describe 39a7a362e16bb27e98738d63f24d1ab5811e26a8
> v1.0-327-g39a7a36
> 
> (This commit is not in 1.0!)
> 
> Please post your qemu-kvm command-line.
> 
> 67 MB/s sequential 4 KB read means 67 * 1024 / 4 = 17152 requests per
> second, so 58 microseconds per request.
> 
> Please post the fio output so we can double-check what is reported.

As you mentioned above, I tested it again with the revision
v1.0-327-g39a7a36, which includes the commit 39a7a36.

Result is though still not good enough.

seqread   : 20.3 MByte/s
seqwrite  : 20.1 MByte/s
randread  : 20.5 MByte/s
randwrite : 20.0 MByte/s

My qemu-kvm commandline is like below:

=======================================================================
/usr/bin/kvm -S -M pc-0.14 -enable-kvm -m 1024 \
-smp 1,sockets=1,cores=1,threads=1 -name mydebian3_8gb \
-uuid d99ad012-2fcc-6f7e-fbb9-bc48b424a258 -nodefconfig -nodefaults \
-chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/mydebian3_8gb.monitor,server,nowait
 \
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown \
-drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
-device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \
-drive 
file=/var/lib/libvirt/images/mydebian3_8gb.img,if=none,id=drive-virtio-disk0,format=raw,cache=none,aio=native
 \
-device 
virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
 \
-drive 
file=/dev/ram0,if=none,id=drive-virtio-disk1,format=raw,cache=none,aio=native \
-device 
virtio-blk-pci,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1 \
-netdev tap,fd=19,id=hostnet0 \
-device 
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:68:9f:d0,bus=pci.0,addr=0x3 
\
-chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 \
-usb -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus \
-device AC97,id=sound0,bus=pci.0,addr=0x4 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
=======================================================================

As you see, /dev/ram0 is being mapped to /dev/vdb on the guest side,
which is used for fio tests.

Here is a sample of fio output:

=======================================================================
# fio -name iops -rw=read -size=1G -iodepth 1 -filename /dev/vdb \
-ioengine libaio -direct=1 -bs=4096
iops: (g=0): rw=read, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=1
Starting 1 process
Jobs: 1 (f=1): [R] [100.0% done] [21056K/0K /s] [5140/0 iops] [eta
00m:00s]
iops: (groupid=0, jobs=1): err= 0: pid=1588
  read : io=1024MB, bw=20101KB/s, iops=5025, runt= 52166msec
    slat (usec): min=4, max=6461, avg=24.00, stdev=19.75
    clat (usec): min=0, max=11934, avg=169.49, stdev=113.91
    bw (KB/s) : min=18200, max=23048, per=100.03%, avg=20106.31, stdev=934.42
  cpu          : usr=5.43%, sys=23.25%, ctx=262363, majf=0, minf=28
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=262144/0, short=0/0
     lat (usec): 2=0.01%, 4=0.16%, 10=0.03%, 20=0.01%, 50=0.27%
     lat (usec): 100=4.07%, 250=89.12%, 500=5.76%, 750=0.30%, 1000=0.13%
     lat (msec): 2=0.12%, 4=0.02%, 10=0.01%, 20=0.01%

Run status group 0 (all jobs):
   READ: io=1024MB, aggrb=20100KB/s, minb=20583KB/s, maxb=20583KB/s,
mint=52166msec, maxt=52166msec

Disk stats (read/write):
  vdb: ios=261308/0, merge=0/0, ticks=40210/0, in_queue=40110, util=77.14%
=======================================================================


So I think, the patch for coroutine-ucontext isn't about the bottleneck
I'm looking for.

Regards,
Dongsu

p.s. Sorry for the late reply. Last week I was on vacation.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]