qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 01/15] qemu coroutine: support bypass mode


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH 01/15] qemu coroutine: support bypass mode
Date: Fri, 01 Aug 2014 16:17:29 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0

Il 01/08/2014 15:48, Ming Lei ha scritto:
> On Fri, Aug 1, 2014 at 9:13 PM, Stefan Hajnoczi <address@hidden> wrote:
>> On Fri, Aug 01, 2014 at 10:54:02AM +0800, Ming Lei wrote:
>>> On Fri, Aug 1, 2014 at 12:30 AM, Paolo Bonzini <address@hidden> wrote:
>>>> Il 31/07/2014 18:13, Ming Lei ha scritto:
>>>>> Follows 'perf report' result on cycles event for with/without bypass
>>>>> coroutine:
>>>>>
>>>>>     http://pastebin.com/ae0vnQ6V
>>>>>
>>>>> From the profiling result, looks bdrv_co_do_preadv() is a bit slow
>>>>> without bypass coroutine.
>>>>
>>>> Yeah, I can count at least 3.3% time spent here:
>>>>
>>>> 0.87%          bdrv_co_do_preadv
>>>> 0.79%          bdrv_aligned_preadv
>>>> 0.71%          qemu_coroutine_switch
>>>> 0.52%          tracked_request_begin
>>>> 0.45%          coroutine_swap
>>>>
>>>> Another ~3% wasted in malloc, etc.
>>>
>>> That should be related with coroutine and the BH in bdrv_co_do_rw().
>>> In this post I didn't apply Stephan's coroutine resize patch which might
>>> decrease usage of malloc() for coroutine.
>>
>> Please rerun with "[PATCH v3 0/2] coroutine: dynamically scale pool
>> size".
> 
> No problem, will do that. Actually in my last post with rfc, this patchset
> was against your coroutine resize patches.
> 
> I will provide the profile data tomorrow.
> 
>>
>>> At least, coroutine isn't cheap from the profile result.
>>
>> Instead of bypassing coroutines we should first understand the overhead
>> that they impose.  Is it due to the coroutine implementation (switching
>> stacks) or due to the bdrv_co_*() code that happens to use coroutines
>> but slow for other reasons.
> 
> From the 3th patch(block: support to bypass qemu coroutinue)
> and the 5th patch(dataplane: enable selective bypassing coroutine),
> the change is to bypass coroutine and BH, and the other bdrv code
> path is same, so it is due to the coroutine implementation, IMO.

But your code breaks all sort of invariants.  For example, the aiocb
must be valid when bdrv_aio_readv/writev return.  virtio-blk does not
use it, but virtio-scsi does.  If we apply your patches now, we will
have to redo it soon.

Basically we should be rewriting parts of block.c so that
bdrv_co_readv/writev calls bdrv_aio_readv/writev instead of vice versa.
 Coroutine creation should be pushed down to the
bdrv_aligned_preadv/bdrv_aligned_pwritev and, in the fast path, you can
simply call the driver's bdrv_aio_readv/writev.

Paolo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]