From: Ming Lei
Subject: Re: [Qemu-devel] [RFC PATCH 2/3] raw-posix: Convert Linux AIO submission to coroutines
Date: Fri, 28 Nov 2014 10:59:33 +0800
Hi Kevin,
On Wed, Nov 26, 2014 at 10:46 PM, Kevin Wolf <address@hidden> wrote:
> This improves the performance of requests because an ACB doesn't need to
> be allocated on the heap any more. It also makes the code nicer and
> smaller.
I am not sure this is a good way to optimize Linux AIO:
- for a raw image under some constraints, the coroutine can be avoided,
since io_submit() won't sleep most of the time
- handling one coroutine takes much more time than a malloc, memset and
free of a small buffer, according to the following test data:
-- 241ns per coroutine
-- 61ns per (malloc, memset, free) of 128 bytes
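A figure like the malloc/memset/free one could be measured with something like the sketch below. This is not the exact test program: bench_alloc_ns() and the iteration count are made up for illustration, and an optimizing compiler may simplify the loop, so treat the result as a rough number only.

```c
/* Needed for clock_gettime() under strict -std= modes; must precede includes. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif

#include <stdlib.h>
#include <string.h>
#include <time.h>

#define BENCH_BUF_SIZE 128   /* same small-buffer size as in the numbers above */

/*
 * Hypothetical helper: returns the average cost in nanoseconds of one
 * (malloc, memset, free) cycle on a BENCH_BUF_SIZE buffer, or -1.0 on error.
 */
static double bench_alloc_ns(long iters)
{
    struct timespec start, end;
    volatile unsigned char sink = 0;   /* discourages removing the loop body */

    if (iters <= 0) {
        return -1.0;
    }

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iters; i++) {
        unsigned char *buf = malloc(BENCH_BUF_SIZE);
        if (!buf) {
            return -1.0;
        }
        memset(buf, 0, BENCH_BUF_SIZE);
        sink += buf[i % BENCH_BUF_SIZE];   /* touch the buffer */
        free(buf);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    double total_ns = (end.tv_sec - start.tv_sec) * 1e9
                      + (end.tv_nsec - start.tv_nsec);
    return total_ns / iters;
}
```

Calling bench_alloc_ns(10000000) and printing the result gives the per-cycle cost on the local machine; the coroutine side would need a comparable loop around coroutine create/enter, which is not shown here.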
I still think we should figure out a fast path that avoids the coroutine
for linux-aio with raw images; otherwise it can't scale well for high
IOPS devices.
Also, we can use a simple buffer pool to avoid the dynamic allocation
easily, can't we?
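To make the buffer pool idea concrete, here is a minimal sketch of a fixed-size free list. None of these names (BufPool, buf_pool_get(), buf_pool_put(), the size limits) exist in QEMU; they are hypothetical, and the sketch assumes the pool is only touched from one thread, so it needs no locking.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define POOL_BUF_SIZE 128   /* assumed request-state size, illustration only */
#define POOL_MAX_FREE 64    /* cap on cached buffers, illustration only */

/* A free buffer stores the link to the next free buffer in its own memory. */
typedef struct PoolBuf {
    struct PoolBuf *next;
} PoolBuf;

typedef struct BufPool {
    PoolBuf *free_list;     /* singly linked list of cached buffers */
    size_t nr_free;         /* number of buffers currently cached */
} BufPool;

/* Take a zeroed buffer from the pool, falling back to malloc() when empty. */
static void *buf_pool_get(BufPool *pool)
{
    PoolBuf *buf = pool->free_list;

    if (buf) {
        pool->free_list = buf->next;
        pool->nr_free--;
    } else {
        buf = malloc(POOL_BUF_SIZE);
        if (!buf) {
            return NULL;
        }
    }
    memset(buf, 0, POOL_BUF_SIZE);
    return buf;
}

/* Return a buffer to the pool; free it for real once the cache is full. */
static void buf_pool_put(BufPool *pool, void *p)
{
    PoolBuf *buf = p;

    if (pool->nr_free >= POOL_MAX_FREE) {
        free(p);
        return;
    }
    buf->next = pool->free_list;
    pool->free_list = buf;
    pool->nr_free++;
}
```

In the steady state a get/put pair is just a couple of pointer updates plus the memset, so the submission path would only hit malloc() while the pool warms up.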
>
> As a side effect, the codepath taken by aio=threads is changed to use
> paio_submit_co(). This doesn't change the performance at this point.
>
> Results of qemu-img bench -t none -c 10000000 [-n] /dev/loop0:
>
>       |      aio=native       |      aio=threads
>       | before   | with patch | before   | with patch
> ------+----------+------------+----------+------------
> run 1 | 29.921s  | 26.932s    | 35.286s  | 35.447s
> run 2 | 29.793s  | 26.252s    | 35.276s  | 35.111s
> run 3 | 30.186s  | 27.114s    | 35.042s  | 34.921s
> run 4 | 30.425s  | 26.600s    | 35.169s  | 34.968s
> run 5 | 30.041s  | 26.263s    | 35.224s  | 35.000s
>
> TODO: Do some more serious benchmarking in VMs with less variance.
> Results of a quick fio run are vaguely positive.
I will run the test with Paolo's fast path approach under a VM I/O
workload.
Thanks,
Ming Lei