Re: [Qemu-devel] [RFC PATCH 2/3] cpus-common: Cache allocated work items
From: Paolo Bonzini
Subject: Re: [Qemu-devel] [RFC PATCH 2/3] cpus-common: Cache allocated work items
Date: Tue, 29 Aug 2017 22:38:50 +0200
On 28 Aug 2017 11:43 PM, "Pranith Kumar" <address@hidden> wrote:
On Mon, Aug 28, 2017 at 1:47 PM, Richard Henderson
<address@hidden> wrote:
> On 08/27/2017 08:53 PM, Pranith Kumar wrote:
>> Using heaptrack, I found that quite a few of our temporary allocations
>> are coming from allocating work items. Instead of doing this
>> continuously, we can cache the allocated items and reuse them instead
>> of freeing them.
>>
>> This reduces the number of allocations by 25% (200000 -> 150000 for
>> ARM64 boot+shutdown test).
>>
>> Signed-off-by: Pranith Kumar <address@hidden>
>
> Why does this list need to record a "last" element?
> It would seem a simple lifo would be sufficient.
>
> (You would also be able to manage the list via cmpxchg without a separate lock,
> but perhaps the difference between the two isn't measurable.)
>
Yes, seems like a better design choice. Will fix in next iteration.
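For reference, the LIFO-via-cmpxchg idea could look roughly like the sketch below. This is not code from the patch: struct work_item, free_list, work_item_get() and work_item_put() are hypothetical names, and it uses C11 <stdatomic.h> rather than QEMU's own atomic helpers. It also assumes only one thread pops at a time; with several concurrent poppers a plain CAS pop is exposed to the ABA problem and would need a tagged pointer or similar.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical work item; field names are illustrative, not QEMU's. */
struct work_item {
    struct work_item *next;
    void (*func)(void *data);
    void *data;
};

/* Head of the LIFO cache of free items (no "last" pointer needed). */
static _Atomic(struct work_item *) free_list;

/* Return an item to the cache. Safe for concurrent callers. */
static void work_item_put(struct work_item *wi)
{
    struct work_item *old =
        atomic_load_explicit(&free_list, memory_order_relaxed);

    do {
        /* On CAS failure 'old' is refreshed, so relink and retry. */
        wi->next = old;
    } while (!atomic_compare_exchange_weak_explicit(&free_list, &old, wi,
                                                    memory_order_release,
                                                    memory_order_relaxed));
}

/*
 * Take an item from the cache, or allocate one if the cache is empty.
 * Assumption: a single consumer at a time (see the ABA note above).
 */
static struct work_item *work_item_get(void)
{
    struct work_item *old =
        atomic_load_explicit(&free_list, memory_order_acquire);

    while (old != NULL &&
           !atomic_compare_exchange_weak_explicit(&free_list, &old,
                                                  old->next,
                                                  memory_order_acquire,
                                                  memory_order_acquire)) {
        /* 'old' now holds the current head; loop re-checks it. */
    }

    if (old == NULL) {
        return calloc(1, sizeof(struct work_item));
    }
    old->next = NULL;
    return old;
}
```

A caller would use work_item_get() where it previously allocated and work_item_put() where it previously freed; the most recently returned item is the first one reused.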
More recent glibc will also have an efficient per-thread allocator, and
though I haven't yet benchmarked the newer glibc malloc, GSlice is slower
than at least both tcmalloc and jemalloc. Perhaps you could instead make
work items statically allocated?
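To illustrate what I mean by statically allocated, here is a minimal single-threaded sketch in which the work item is embedded in the structure that owns it, so queuing never calls malloc. All names (struct work_item, struct work_queue, struct device, queue_work, run_queue) are made up for the example and are not the existing QEMU API; real code would also need the queue lock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical work item, embedded in its owner instead of heap-allocated. */
struct work_item {
    struct work_item *next;
    void (*func)(void *data);
    void *data;
    bool queued;              /* true while the item sits on a queue */
};

/* Hypothetical single-threaded FIFO of pending work. */
struct work_queue {
    struct work_item *first;
    struct work_item *last;
};

/* Example owner: the item's storage lives as long as the device does. */
struct device {
    int resets;
    struct work_item reset_work;
};

/* Queue an item; returns false if it is already pending. */
static bool queue_work(struct work_queue *q, struct work_item *wi)
{
    if (wi->queued) {
        return false;
    }
    wi->queued = true;
    wi->next = NULL;
    if (q->last) {
        q->last->next = wi;
    } else {
        q->first = wi;
    }
    q->last = wi;
    return true;
}

/* Run and dequeue every pending item; nothing is freed. */
static void run_queue(struct work_queue *q)
{
    while (q->first) {
        struct work_item *wi = q->first;

        q->first = wi->next;
        if (q->first == NULL) {
            q->last = NULL;
        }
        wi->queued = false;   /* cleared first so func may requeue the item */
        wi->func(wi->data);
    }
}

/* Example callback for the embedded item. */
static void device_count_reset(void *data)
{
    ((struct device *)data)->resets++;
}
```

The trade-off is that an embedded item can be pending only once at a time, so callers have to cope with queue_work() returning false.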
Thanks,
Paolo
Thanks,
--
Pranith