

From: ein
Subject: [Qemu-devel] Very poor IO performance which looks like some design problem.
Date: Fri, 10 Apr 2015 22:38:25 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.4.0

Hello Group.

Let me describe my setup first.
The storage base is 6x SAS drives in RAID50 in an IBM x3650 M3 with an
LSI ServeRAID M5015 controller (FW Package Build: 12.13.0-0179).

Disk specs: http://www.cnet.com/products/seagate-savvio-10k-4-600gb-sas-2/specs/

I've created 6x single-drive RAID0 virtual drives from the SAS disks above, because the controller's own RAID performed poorly at every possible RAID level. Each virtual drive that is a member of my array looks like this:

Virtual Drive: 3 (Target Id: 3)
Name                :
RAID Level          : Primary-0, Secondary-0, RAID Level Qualifier-0
Size                : 557.861 GB
Sector Size         : 512
Parity Size         : 0
State               : Optimal
Strip Size          : 128 KB
Number Of Drives    : 1
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAheadNone, Cached, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Cached, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Enabled
Encryption Type     : None
Is VD Cached: No
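
For reference, the per-VD listing above comes from the controller CLI; the query was along these lines (MegaCli64 being the binary name on my Debian install is the only assumption here):

# Dump properties of all logical drives on all adapters
MegaCli64 -LDInfo -Lall -aALL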

On those 6 RAID0 volumes I've built a software RAID (mdadm, Debian 8 testing): two RAID5 arrays striped together, which results in a RAID50 array:

Personalities : [raid6] [raid5] [raid4] [raid0]
md0 : active raid0 md2[1] md1[0]
      2339045376 blocks super 1.2 512k chunks
     
md2 : active raid5 sdg1[2] sdf1[1] sde1[0]
      1169653760 blocks super 1.2 level 5, 128k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 1/5 pages [4KB], 65536KB chunk

md1 : active raid5 sdd1[2] sdc1[1] sdb1[0]
      1169653760 blocks super 1.2 level 5, 128k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 1/5 pages [4KB], 65536KB chunk
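
For completeness, the arrays were created roughly like this (member device names and chunk sizes match the mdstat output above; mdadm adds the internal bitmap by default on arrays this size, as far as I know):

# Two 3-disk RAID5 arrays with 128k chunks, then a RAID0 stripe over them
mdadm --create /dev/md1 --level=5 --raid-devices=3 --chunk=128 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm --create /dev/md2 --level=5 --raid-devices=3 --chunk=128 /dev/sde1 /dev/sdf1 /dev/sdg1
mdadm --create /dev/md0 --level=0 --raid-devices=2 --chunk=512 /dev/md1 /dev/md2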

On that RAID I've created an ext2 filesystem:
mkfs.ext2 -b 4096 -E stride=128,stripe-width=512 -vvm1 /dev/mapper/hdd-images -i 4194304
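
(If I have the math right, those numbers follow from the geometry: stride = 512 KiB md0 chunk / 4 KiB block = 128, and stripe-width = 128 x 4 underlying data disks = 512.)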

Small benchmarks of sequential read and write (20 GiB, with echo 3 > /proc/sys/vm/drop_caches before every test):
1. Filesystem benchmark: read 380 MB/s, write 200 MB/s
2. LVM volume benchmark: read 409 MB/s (could not do a write test)
3. RAID device test: 423 MB/s
4. Reading continuously from 4 of the SAS virtual drives with dd, I could easily hit the controller's bottleneck (6 Gb/s).
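
The tests themselves were plain dd runs of roughly this shape (the target file path is a placeholder, and bs=1M is just my habit):

echo 3 > /proc/sys/vm/drop_caches
dd if=/dev/md0 of=/dev/null bs=1M count=20480                                        # sequential read from the RAID device
dd if=/dev/zero of=/path/to/mounted/fs/testfile bs=1M count=20480 conv=fdatasync     # sequential write through the filesystem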

I've installed Windows Server 2012 and I'm having big problems finding an optimal configuration that maximizes total throughput. The best performance I got was with this configuration:

qemu-system-x86_64 -enable-kvm -name XXXX -S -machine pc-1.1,accel=kvm,usb=off \
    -cpu host -m 16000 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 \
    -uuid d0e14081-b4a0-23b5-ae39-110a686b0e55 -no-user-config -nodefaults \
    -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/acm-server.monitor,server,nowait \
    -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -no-shutdown -boot strict=on \
    -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 \
    -drive file=/var/lib/libvirt/images/xxx.img,if=none,id=drive-virtio-disk0,format=raw,cache=unsafe \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
    -drive file=/dev/mapper/hdd-storage,if=none,id=drive-virtio-disk1,format=raw,cache=unsafe \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1 \
    -drive file=/var/lib/libvirt/images-hdd/storage.img,if=none,id=drive-virtio-disk2,format=raw,cache=unsafe \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk2,id=virtio-disk2 \
    -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:f5:b5:b7,bus=pci.0,addr=0x3 \
    -chardev spicevmc,id=charchannel0,name=vdagent \
    -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 \
    -device usb-tablet,id=input0 \
    -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on \
    -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,bus=pci.0,addr=0x2 \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 \
    -msg timestamp=on


I was able to get 150 MB/s sequential read in the VM. Then I discovered something extraordinary: when I limited the CPU count to one instead of four, disk throughput almost doubled. Then I realized something:



QEMU creates more than 70 threads, and every one of them tries to write to disk, which results in:
1. High I/O times.
2. Large latency.
3. Poor sequential read/write speeds.
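
The thread count is easy to confirm from the host side, for example:

ls /proc/$(pgrep -f qemu-system-x86_64 | head -n1)/task | wc -l    # number of threads in the QEMU process
top -H -p $(pgrep -f qemu-system-x86_64 | head -n1)                # watch per-thread CPU and state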

When I limited the number of cores, I guess I limited the number of threads as well; that's why I got better numbers.

I've tried combining the aio=native and aio=threads settings with the deadline scheduler. Native AIO was much worse.
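
Concretely, the variants I compared were drive lines of this shape (cache=none is what I paired with aio=native, since, as I understand it, native AIO wants the page cache bypassed):

-drive file=/dev/mapper/hdd-storage,if=none,id=drive-virtio-disk1,format=raw,cache=none,aio=native
-drive file=/dev/mapper/hdd-storage,if=none,id=drive-virtio-disk1,format=raw,cache=none,aio=threads
echo deadline > /sys/block/sdb/queue/scheduler    # repeated for sdc..sdg on the host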

The final question: is there any way to prevent QEMU from spawning such a large number of threads when the VM is doing only one sequential R/W operation?
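
One thing I'm considering, if my QEMU build supports it, is pinning each virtio-blk device to a single dedicated I/O thread, along these lines (whether this actually reduces the thread storm is exactly what I'm unsure about):

-object iothread,id=iothread1 \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1,iothread=iothread1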






