Re: [Qemu-devel] [PATCH] i386: turn off l3-cache property by default
From: Longpeng (Mike)
Subject: Re: [Qemu-devel] [PATCH] i386: turn off l3-cache property by default
Date: Wed, 29 Nov 2017 15:38:26 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:11.0) Gecko/20120327 Thunderbird/11.0.1

On 2017/11/29 14:01, Roman Kagan wrote:
> On Wed, Nov 29, 2017 at 01:20:38PM +0800, Longpeng (Mike) wrote:
>> On 2017/11/29 5:13, Eduardo Habkost wrote:
>>
>>> [CCing the people who were copied in the original patch that
>>> enabled l3cache]
>>>
>>> On Tue, Nov 28, 2017 at 11:20:27PM +0300, Denis V. Lunev wrote:
>>>> On 11/28/2017 10:58 PM, Eduardo Habkost wrote:
>>>>> Hi,
>>>>>
>>>>> On Fri, Nov 24, 2017 at 04:26:50PM +0300, Denis Plotnikov wrote:
>>>>>> Commit 14c985cffa "target-i386: present virtual L3 cache info for vcpus"
>>>>>> introduced exposing L3 to the guest and enabled it by default.
>>>>>>
>>>>>> The motivation behind it was that in the Linux scheduler, when waking up
>>>>>> a task on a sibling CPU, the task was put onto the target CPU's runqueue
>>>>>> directly, without sending a reschedule IPI. Reduction in the IPI count
>>>>>> led to performance gain.
>>>>>>
>>>>>> However, this isn't the whole story. Once the task is on the target
>>>>>> CPU's runqueue, it may have to preempt the current task on that CPU, be
>>>>>> it the idle task putting the CPU to sleep or just another running task.
>>>>>> For that a reschedule IPI will have to be issued, too. Only when that
>>>>>> other CPU is running a normal task for too little time, the fairness
>>>>>> constraints will prevent the preemption and thus the IPI.
>>>>>>
>>
>> Agree. :)
>>
>> Our test VM was a Suse11 guest with idle=poll at that time, and now I realize
>                                      ^^^^^^^^^
> Oh, that's a whole lot of a difference! I wish you mentioned that in
> that patch.
>
:( Sorry for missing that...
>> that Suse11 has a BUG in its scheduler.
>>
>> For RHEL 7.3 or the upstream kernel, in ttwu_queue_remote(), a RES IPI is
>> issued only if rq->idle is not polling:
>> '''
>> static void ttwu_queue_remote(struct task_struct *p, int cpu)
>> {
>>     struct rq *rq = cpu_rq(cpu);
>>
>>     if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list)) {
>>         if (!set_nr_if_polling(rq->idle))
>>             smp_send_reschedule(cpu);
>>         else
>>             trace_sched_wake_idle_without_ipi(cpu);
>>     }
>> }
>> '''
>>
>> But Suse11 does not make this check; it sends a RES IPI unconditionally.
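
To make the difference concrete, here is a minimal user-space sketch of the two behaviours. It is a model, not kernel code: the `model_*` names and the flag values are hypothetical, the retry loop only approximates what set_nr_if_polling() does, and the llist_add() "was the list empty" condition is left out on purpose.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits for this model; the kernel's real values differ. */
#define MODEL_NEED_RESCHED (1u << 0)
#define MODEL_POLLING      (1u << 1)

/* Stand-in for the thread flags of the idle task on the target CPU. */
struct model_idle {
    atomic_uint flags;
};

/*
 * Model of set_nr_if_polling(): if the idle task is polling, set the
 * need-resched bit and return true (the poller will notice it, so no
 * IPI is needed). If it is not polling, return false.
 */
static bool model_set_nr_if_polling(struct model_idle *idle)
{
    unsigned int old = atomic_load(&idle->flags);

    for (;;) {
        if (!(old & MODEL_POLLING))
            return false;
        if (old & MODEL_NEED_RESCHED)
            return true;
        if (atomic_compare_exchange_weak(&idle->flags, &old,
                                         old | MODEL_NEED_RESCHED))
            return true;
        /* A failed compare-exchange reloaded 'old'; try again. */
    }
}

/* Returns true when a remote wakeup would have to send a reschedule IPI. */
static bool model_wakeup_needs_ipi(struct model_idle *idle, bool check_polling)
{
    assert(idle);
    if (!check_polling)
        return true;    /* unconditional IPI, as described for Suse11 */
    return !model_set_nr_if_polling(idle);
}
```

With check_polling set and the target idle task polling (as with idle=poll), the model reports that no IPI is needed; with check_polling cleared it always reports one, which is the Suse11 behaviour described above.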
>>
>>>>>> This boils down to the improvement being only achievable in workloads
>>>>>> with many actively switching tasks. We had no access to the
>>>>>> (proprietary?) SAP HANA benchmark the commit referred to, but the
>>>>>> pattern is also reproduced with "perf bench sched messaging -g 1"
>>>>>> on 1 socket, 8 cores vCPU topology, we see indeed:
>>>>>>
>>>>>> l3-cache    #res IPI /s    #time / 10000 loops
>>>>>> off         560K           1.8 sec
>>>>>> on          40K            0.9 sec
>>>>>>
>>>>>> Now there's a downside: with L3 cache the Linux scheduler is more eager
>>>>>> to wake up tasks on sibling CPUs, resulting in unnecessary cross-vCPU
>>>>>> interactions and therefore excessive halts and IPIs. E.g. "perf bench
>>>>>> sched pipe -i 100000" gives
>>>>>>
>>>>>> l3-cache    #res IPI /s    #HLT /s    #time / 100000 loops
>>>>>> off         200 (no K)     230        0.2 sec
>>>>>> on          400K           330K       0.5 sec
>>>>>>
>>
>> I guess this issue could be resolved by disabling SD_WAKE_AFFINE.
>
> But that requires extra tuning in the guest which is even less likely to
> happen in the cloud case when VM admin != host admin.
>
Ah, yep, that's a problem.
>> As Gonglei said:
>> 1. the L3 cache relates to the user experience.
>> 2. the glibc would get the cache info by CPUID directly, and relates to the
>> memory performance.
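
For anyone who wants to check what glibc actually sees inside a guest with l3-cache on and off, a small sketch along these lines might help. It assumes a glibc-based guest: the `_SC_LEVEL*_CACHE_SIZE` names are glibc extensions to sysconf(), and the `report_*` helper names are made up for this example. A result of 0 or -1 means glibc did not report that level.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Print one cache size as reported by glibc and return it.
 * 0 or -1 means the value is not available for this level.
 */
static long report_cache(const char *label, int name)
{
    long bytes = sysconf(name);

    printf("%-4s %ld bytes\n", label, bytes);
    return bytes;
}

/* Print the data-cache hierarchy as glibc sees it in this guest. */
static void report_guest_caches(void)
{
    report_cache("L1d", _SC_LEVEL1_DCACHE_SIZE);
    report_cache("L2", _SC_LEVEL2_CACHE_SIZE);
    report_cache("L3", _SC_LEVEL3_CACHE_SIZE);
}
```

Calling report_guest_caches() from main() in the same guest under both settings should show whether the L3 line changes.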
>>
>> What's more, the L3 cache relates to the sched_domain, which is important to
>> the (load) balancer when the system is busy.
>>
>> All this doesn't mean the patch is insignificant; I just think we should do
>> more research before deciding. I'll do some tests, thanks. :)
>
> Looking forward to it, thanks!
> Roman.
>
>
--
Regards,
Longpeng(Mike)