Re: [Qemu-devel] [PATCH V2 1/1] linux-aio: prevent submitting more than


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH V2 1/1] linux-aio: prevent submitting more than MAX_EVENTS
Date: Fri, 15 Jul 2016 12:37:02 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1


On 15/07/2016 12:17, Roman Penyaev wrote:
> On Fri, Jul 15, 2016 at 11:58 AM, Paolo Bonzini <address@hidden> wrote:
>>
>>
>> On 15/07/2016 11:18, Roman Penyaev wrote:
>>> Those 3 red spikes and a blue hill are what we have to focus on.  The
>>> blue hill at the right corner of the chart means that almost always the
>>> ring buffer was observed as full, i.e. qemu_laio_completion_bh() got
>>> a chance to reap completions not very often, meanwhile completed
>>> requests stand in the ring buffer for quite a long time which degrades
>>> the overall performance.
>>>
>>> The results covered by the red line are much better and that can be
>>> explained by those 3 red spikes, which are almost in the middle of the
>>> whole distribution, i.e. qemu_laio_completion_bh() is called more often,
>>> completed requests do not stall, giving fio a chance to submit new fresh
>>> requests.
>>>
>>> The theoretical fix would be to schedule completion BH just after
>>> successful io_submit, i.e.:
>>
>> What about removing the qemu_bh_cancel but keeping the rest of the patch?
> 
> That's exactly what I did.  Numbers go up as expected, from ~1600MB/s to ~1800MB/s.
> So basically this hunk of the debatable patch:
> 
>      if (event_notifier_test_and_clear(&s->e)) {
> -        qemu_bh_schedule(s->completion_bh);
> +        qemu_laio_completion_bh(s);
>      }
> 
> does not have any impact and can be ignored.  At least I did not notice
> anything important.
> 
>>
>> I'm also interested in a graph with this patch ("linux-aio: prevent
>> submitting more than MAX_EVENTS") on top of origin/master.
> 
> I can plot it also of course.
> 
>>
>> Thanks for the analysis.  Sometimes a picture _is_ worth a thousand
>> words, even if it's measuring "only" second-order effects (# of
>> completions is not what causes the slowdown, but # of completions
>> affects latency which causes the slowdown).
> 
> Yes, you are right, latency.  With userspace io_getevents ~0 costs we
> can peek requests as often as we like to decrease latency on very
> fast devices.  That can also bring something.  Probably after each
> io_submit() it makes sense to peek and complete something.

Right, especially 1) because io_getevents with timeout 0 is cheap (it
peeks at the ring buffer before the syscall); 2) because we want anyway
to replace io_getevents with userspace code through your other patch.
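
To illustrate point 1, here is a minimal sketch of what "peeking at the
ring buffer" amounts to.  It assumes a simplified ring layout: the struct
and the sketch_ring_* names are hypothetical, not the kernel's aio_ring or
libaio's API, and a real implementation would also need memory barriers
because the kernel is the producer.

```c
#include <assert.h>

/* Hypothetical, simplified completion ring (NOT the real aio_ring). */
struct sketch_ring {
    unsigned head;   /* next event the consumer will read */
    unsigned tail;   /* next slot the producer will fill */
    unsigned nr;     /* number of slots in use, at most 8 here */
    long events[8];  /* stand-in for the results of completed requests */
};

/* Number of completions waiting; a pure memory read, no syscall. */
unsigned sketch_ring_pending(const struct sketch_ring *r)
{
    assert(r->nr > 0 && r->nr <= 8);
    return (r->tail + r->nr - r->head) % r->nr;
}

/*
 * Reap up to max completions into out[] and return how many were reaped.
 * Returning 0 corresponds to the "timeout 0, nothing ready" case.
 */
unsigned sketch_ring_reap(struct sketch_ring *r, long *out, unsigned max)
{
    unsigned n = 0;

    while (n < max && r->head != r->tail) {
        out[n++] = r->events[r->head];
        r->head = (r->head + 1) % r->nr;
    }
    return n;
}

/* Producer side, present only so the sketch can be exercised. */
int sketch_ring_complete(struct sketch_ring *r, long res)
{
    unsigned next = (r->tail + 1) % r->nr;

    if (next == r->head) {
        return -1;   /* ring full: completions are not being reaped */
    }
    r->events[r->tail] = res;
    r->tail = next;
    return 0;
}
```

Calling something like sketch_ring_reap() right after each submission is
the "peek and complete something" idea from above: an empty ring costs
only a comparison of head and tail.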

Paolo