From: Ming Lei
Subject: Re: [Qemu-devel] [regression] dataplane: throughout -40% by commit 580b6b2aa2
Date: Sat, 28 Jun 2014 02:01:32 +0800
On Fri, Jun 27, 2014 at 8:01 PM, Stefan Hajnoczi <address@hidden> wrote:
> On Thu, Jun 26, 2014 at 11:14:16PM +0800, Ming Lei wrote:
>> Hi Stefan,
>>
>> I found that VM block I/O throughput decreased by more than 40%
>> on my laptop, and it looks much worse in my server environment.
>> It is caused by your commit 580b6b2aa2:
>>
>> dataplane: use the QEMU block layer for I/O
>>
>> I ran fio with the config below to test random reads:
>>
>> [global]
>> direct=1
>> size=4G
>> bsrange=4k-4k
>> timeout=20
>> numjobs=4
>> ioengine=libaio
>> iodepth=64
>> filename=/dev/vdc
>> group_reporting=1
>>
>> [f]
>> rw=randread
>>
>> Along with the throughput drop, latency improved a little.
>>
>> With this commit, the I/O blocks submitted to the fs become much smaller
>> than before, and more io_submit() calls have to be made to the kernel,
>> which means the effective iodepth may become much lower.
>>
>> I am not surprised by the result, since I previously compared VM I/O
>> performance between qemu and lkvm. lkvm has no big qemu lock problem
>> and handles I/O in a dedicated thread, but its block I/O throughput
>> is still much worse than qemu's because, IMO, lkvm doesn't submit
>> block I/O in batches the way the previous dataplane did.
>>
>> But now that you have changed the way I/O is submitted, could you share
>> the motivation for the change? Is the throughput drop what you expected?
>
> Thanks for reporting this. 40% is a serious regression.
>
> We were expecting a regression since the custom Linux AIO codepath has
> been replaced with the QEMU block layer (which offers features like
> image formats, snapshots, I/O throttling).
>
> Let me know if you get stuck working on a patch. Implementing batching
> sounds like a good idea. I never measured the impact when I wrote the
> ioq code, it just seemed like a natural way to structure the code.
I just implemented plug & unplug based batching, and it is working now.
But throughput still shows no obvious improvement.
The load in the IOthread looks a bit low, so I am wondering whether there
is a blocking point caused by the QEMU block layer.
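
To illustrate what plug/unplug batching changes in terms of submission calls, here is a toy Python model. It is only a sketch and is neither the patch mentioned above nor QEMU code: `BatchingQueue`, `count_submits`, and the `submit` callback are hypothetical names, and the callback merely stands in for one io_submit() call.

```python
class BatchingQueue:
    """Toy model of plug/unplug batching (hypothetical, not QEMU code)."""

    def __init__(self, submit):
        self._submit = submit   # callable taking a list of requests
        self._pending = []      # requests queued but not yet submitted
        self._plugged = 0       # nesting depth of plug() calls

    def plug(self):
        # While plugged, enqueue() only queues requests.
        self._plugged += 1

    def enqueue(self, req):
        self._pending.append(req)
        if self._plugged == 0:
            # Not plugged: every request becomes its own submit call.
            self._flush()

    def unplug(self):
        if self._plugged == 0:
            raise RuntimeError("unplug() without matching plug()")
        self._plugged -= 1
        if self._plugged == 0:
            # Outermost unplug: hand everything over in one call.
            self._flush()

    def _flush(self):
        if self._pending:
            batch, self._pending = self._pending, []
            self._submit(batch)


def count_submits(n_requests, batched):
    """Return how many submit calls are made for n_requests requests."""
    calls = []
    queue = BatchingQueue(calls.append)
    if batched:
        queue.plug()
    for i in range(n_requests):
        queue.enqueue(i)
    if batched:
        queue.unplug()
    return len(calls)
```

In this model, count_submits(64, batched=False) makes 64 submit calls while count_submits(64, batched=True) makes one, which mirrors the difference in io_submit() counts described in the original report. It says nothing about where the remaining throughput gap comes from.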
> Hopefully this 40% number is purely due to batching and we can get most
> of the performance back.
I will double-check it, but based on my previous comparison between
lkvm and qemu, batching is the only difference.
Thanks,
--
Ming Lei