

From: ein
Subject: [Qemu-devel] Very poor IO performance which looks like some design problem.
Date: Fri, 10 Apr 2015 22:38:25 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.4.0

Hello Group.

Let me describe my setup first.
The storage base is 6x SAS drives in RAID50 in an IBM x3650 M3 with an
LSI ServeRAID M5015 controller (FW Package Build: 12.13.0-0179).

Disk specs: http://www.cnet.com/products/seagate-savvio-10k-4-600gb-sas-2/specs/

I've created 6x single-drive RAID0 virtual drives from the SAS disks above, because the controller's own RAID performed poorly at every possible RAID level. Each virtual drive that is a member of my array looks like this:

Virtual Drive: 3 (Target Id: 3)
Name                :
RAID Level          : Primary-0, Secondary-0, RAID Level Qualifier-0
Size                : 557.861 GB
Sector Size         : 512
Parity Size         : 0
State               : Optimal
Strip Size          : 128 KB
Number Of Drives    : 1
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAheadNone, Cached, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Cached, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Enabled
Encryption Type     : None
Is VD Cached: No
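
For reference, the per-VD listing above comes from the controller CLI; the query was along these lines (MegaCli64 being the binary name on my Debian install is the only assumption here):

# Dump properties of all logical drives on all adapters
MegaCli64 -LDInfo -Lall -aALL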

On those 6 RAID0 volumes I've built a software RAID (mdadm, Debian 8 testing): two RAID5 arrays striped together, which results in a RAID50 array:

Personalities : [raid6] [raid5] [raid4] [raid0]
md0 : active raid0 md2[1] md1[0]
      2339045376 blocks super 1.2 512k chunks
     
md2 : active raid5 sdg1[2] sdf1[1] sde1[0]
      1169653760 blocks super 1.2 level 5, 128k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 1/5 pages [4KB], 65536KB chunk

md1 : active raid5 sdd1[2] sdc1[1] sdb1[0]
      1169653760 blocks super 1.2 level 5, 128k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 1/5 pages [4KB], 65536KB chunk
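
For completeness, the arrays were created roughly like this (member device names and chunk sizes match the mdstat output above; mdadm adds the internal bitmap by default on arrays this size, as far as I know):

# Two 3-disk RAID5 arrays with 128k chunks, then a RAID0 stripe over them
mdadm --create /dev/md1 --level=5 --raid-devices=3 --chunk=128 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm --create /dev/md2 --level=5 --raid-devices=3 --chunk=128 /dev/sde1 /dev/sdf1 /dev/sdg1
mdadm --create /dev/md0 --level=0 --raid-devices=2 --chunk=512 /dev/md1 /dev/md2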

On that RAID I've created an ext2 filesystem:
mkfs.ext2 -b 4096 -E stride=128,stripe-width=512 -vvm1 /dev/mapper/hdd-images -i 4194304
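
(If I have the math right, those numbers follow from the geometry: stride = 512 KiB md0 chunk / 4 KiB block = 128, and stripe-width = 128 x 4 underlying data disks = 512.)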

Small benchmarks of sequential read and write (20 GiB, with echo 3 > /proc/sys/vm/drop_caches before every test):
1. Filesystem benchmark: read 380 MB/s, write 200 MB/s
2. LVM volume benchmark: read 409 MB/s (could not do a write test)
3. RAID device test: 423 MB/s
4. Reading continuously from 4 of the SAS virtual drives with dd, I could easily hit the controller's bottleneck (6 Gb/s).
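
The tests themselves were plain dd runs of roughly this shape (the target file path is a placeholder, and bs=1M is just my habit):

echo 3 > /proc/sys/vm/drop_caches
dd if=/dev/md0 of=/dev/null bs=1M count=20480                                        # sequential read from the RAID device
dd if=/dev/zero of=/path/to/mounted/fs/testfile bs=1M count=20480 conv=fdatasync     # sequential write through the filesystem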

I've installed Windows Server 2012 and I'm having big problems finding an optimal configuration that maximizes total throughput. The best performance I got was with this configuration:

qemu-system-x86_64 -enable-kvm -name XXXX -S -machine pc-1.1,accel=kvm,usb=off \
    -cpu host -m 16000 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 \
    -uuid d0e14081-b4a0-23b5-ae39-110a686b0e55 -no-user-config -nodefaults \
    -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/acm-server.monitor,server,nowait \
    -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -no-shutdown -boot strict=on \
    -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 \
    -drive file=/var/lib/libvirt/images/xxx.img,if=none,id=drive-virtio-disk0,format=raw,cache=unsafe \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
    -drive file=/dev/mapper/hdd-storage,if=none,id=drive-virtio-disk1,format=raw,cache=unsafe \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1 \
    -drive file=/var/lib/libvirt/images-hdd/storage.img,if=none,id=drive-virtio-disk2,format=raw,cache=unsafe \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk2,id=virtio-disk2 \
    -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:f5:b5:b7,bus=pci.0,addr=0x3 \
    -chardev spicevmc,id=charchannel0,name=vdagent \
    -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 \
    -device usb-tablet,id=input0 \
    -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on \
    -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,bus=pci.0,addr=0x2 \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 \
    -msg timestamp=on


I was able to get 150 MB/s sequential read in the VM. Then I discovered something extraordinary: when I limited the CPU count to one instead of four, disk throughput almost doubled. Then I realized something:



QEMU creates more than 70 threads, and every one of them tries to write to disk, which results in:
1. High I/O times.
2. Large latency.
3. Poor sequential read/write speeds.
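
The thread count is easy to confirm from the host side, for example:

ls /proc/$(pgrep -f qemu-system-x86_64 | head -n1)/task | wc -l    # number of threads in the QEMU process
top -H -p $(pgrep -f qemu-system-x86_64 | head -n1)                # watch per-thread CPU and state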

When I limited the number of cores, I guess I limited the number of threads as well; that's why I got better numbers.

I've tried combining the aio=native and aio=threads settings with the deadline scheduler. Native AIO was much worse.
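
Concretely, the variants I compared were drive lines of this shape (cache=none is what I paired with aio=native, since, as I understand it, native AIO wants the page cache bypassed):

-drive file=/dev/mapper/hdd-storage,if=none,id=drive-virtio-disk1,format=raw,cache=none,aio=native
-drive file=/dev/mapper/hdd-storage,if=none,id=drive-virtio-disk1,format=raw,cache=none,aio=threads
echo deadline > /sys/block/sdb/queue/scheduler    # repeated for sdc..sdg on the host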

The final question: is there any way to prevent QEMU from spawning such a large number of threads when the VM is doing only one sequential R/W operation?
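
One thing I'm considering, if my QEMU build supports it, is pinning each virtio-blk device to a single dedicated I/O thread, along these lines (whether this actually reduces the thread storm is exactly what I'm unsure about):

-object iothread,id=iothread1 \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1,iothread=iothread1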






