Re: [Qemu-devel] [PATCH 00/15] optimize Qemu RSS usage
From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH 00/15] optimize Qemu RSS usage
Date: Tue, 28 Jun 2016 08:29:22 -0400 (EDT)
> On 28.06.2016 at 13:37, Paolo Bonzini wrote:
> > On 28/06/2016 11:01, Peter Lieven wrote:
> >> I recently found that Qemu uses several hundred megabytes more RSS
> >> memory than older versions such as Qemu 2.2.0. So I started tracing
> >> memory allocation and found 2 major reasons for this.
> >>
> >> 1) We changed the qemu coroutine pool to have a per-thread pool and a
> >> global release pool. The chosen pool size and the changed algorithm
> >> could lead to up to 192 free coroutines with just a single iothread.
> >> Each coroutine in the pool has 1MB of stack memory.
> > But the fix, as you correctly note, is to reduce the stack size. It
> > would be nice to compile block-obj-y with -Wstack-usage=2048 too.
>
> To reveal if there are any big stack allocations in the block layer?
Yes. Most should be fixed by now, but a handful are probably still there.
(definitely one in vvfat.c).
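
For illustration, here is a minimal sketch of the kind of code that GCC's
-Wstack-usage=2048 is meant to flag, next to a heap-based alternative. It is
not taken from the block layer: BOUNCE_BUF_SIZE and both function names are
hypothetical, and the 32 KiB buffer is only an assumed example size.

```c
/* Sketch only: compile with e.g. `gcc -Wstack-usage=2048 -c stack_usage.c`.
 * BOUNCE_BUF_SIZE and both function names are hypothetical. */
#include <stdlib.h>
#include <string.h>

#define BOUNCE_BUF_SIZE (32 * 1024)

/* Keeps 32 KiB of locals on the stack; this is the pattern the warning
 * is intended to catch when the limit is 2048 bytes. */
unsigned sum_bytes_stack(const unsigned char *src, size_t len)
{
    unsigned char buf[BOUNCE_BUF_SIZE];
    unsigned sum = 0;

    if (len > sizeof(buf)) {
        len = sizeof(buf);
    }
    memcpy(buf, src, len);
    for (size_t i = 0; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}

/* Same work with the buffer on the heap, so the frame stays small.
 * Returns 0 on success and -1 if the allocation fails. */
int sum_bytes_heap(const unsigned char *src, size_t len, unsigned *sum_out)
{
    unsigned char *buf = malloc(BOUNCE_BUF_SIZE);
    unsigned sum = 0;

    if (!buf) {
        return -1;
    }
    if (len > BOUNCE_BUF_SIZE) {
        len = BOUNCE_BUF_SIZE;
    }
    memcpy(buf, src, len);
    for (size_t i = 0; i < len; i++) {
        sum += buf[i];
    }
    free(buf);
    *sum_out = sum;
    return 0;
}
```

With a 64kB coroutine stack, a single frame like the first one would already
consume half of the available space, which is why such allocations matter
before the stack size is reduced.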
> It seems that reducing to 64kB breaks live migration in some
> (non-reproducible) cases.
Does it hit the guard page?
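
As a sketch of what "hitting the guard page" means, the following allocates
a stack with mmap and makes the lowest page inaccessible, so an overflow
faults immediately instead of silently corrupting adjacent memory. This is
an assumed illustration, not QEMU's coroutine code: GuardedStack and both
functions are hypothetical names, and it assumes a downward-growing stack on
a POSIX system.

```c
/* Sketch only: GuardedStack and these helpers are hypothetical names. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS with strict -std= modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

typedef struct {
    void *base;          /* start of the mapping, i.e. the guard page */
    void *usable;        /* first usable byte, just above the guard page */
    size_t usable_size;  /* requested size rounded up to whole pages */
    size_t total;        /* usable_size plus one guard page */
} GuardedStack;

/* Returns 0 on success, -1 on failure. */
int guarded_stack_alloc(GuardedStack *s, size_t size)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0 || size == 0) {
        return -1;
    }
    size_t page = (size_t)ps;
    size_t usable = (size + page - 1) / page * page;
    size_t total = usable + page;

    void *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return -1;
    }
    /* The stack grows down towards base, so the lowest page is the guard.
     * Any access to it raises SIGSEGV. */
    if (mprotect(base, page, PROT_NONE) != 0) {
        munmap(base, total);
        return -1;
    }
    s->base = base;
    s->usable = (char *)base + page;
    s->usable_size = usable;
    s->total = total;
    return 0;
}

void guarded_stack_free(GuardedStack *s)
{
    munmap(s->base, s->total);
    s->base = NULL;
    s->usable = NULL;
    s->usable_size = 0;
    s->total = 0;
}
```

If the failures are stack overflows, a layout like this would turn them into
a crash at the faulting address, which is easier to diagnose than sporadic
corruption.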
> >> 2) Between Qemu 2.2.0 and 2.3.0 RCU was introduced, which led to
> >> delayed freeing of memory. This led to higher heap allocations which
> >> could not effectively be returned to the kernel (most likely due to
> >> fragmentation).
> > I agree that some of the exec.c allocations need some care, but I would
> > prefer to use a custom free list or lazy allocation instead of mmap.
>
> This would only help if the elements from the free list were allocated
> using mmap? The issue is that RCU delays the freeing so that the number of
> concurrent allocations is high and then a bunch is freed at once. If the
> memory had been malloced it would still have caused trouble.
The free list should improve reuse and fragmentation. I'll take a look at
lazy allocation of subpages, too.
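
To make the free list idea concrete, here is a minimal sketch of a bounded
cache for fixed-size objects: released objects are kept for reuse instead of
going back to malloc, so a burst of frees is followed by reuse of the same
blocks rather than new heap growth. It is not the exec.c code; FreeList and
its functions are hypothetical names, and the sketch is not thread-safe, so
real use would need locking or per-thread lists.

```c
/* Sketch only: FreeList and these helpers are hypothetical names. */
#include <stdlib.h>

typedef struct FreeNode {
    struct FreeNode *next;
} FreeNode;

typedef struct {
    FreeNode *head;     /* singly linked list of cached objects */
    size_t obj_size;    /* size of every object handed out */
    size_t cached;      /* number of objects currently cached */
    size_t max_cached;  /* upper bound, so the cache cannot grow forever */
} FreeList;

void free_list_init(FreeList *fl, size_t obj_size, size_t max_cached)
{
    fl->head = NULL;
    /* A cached object must be large enough to hold the list link. */
    fl->obj_size = obj_size < sizeof(FreeNode) ? sizeof(FreeNode) : obj_size;
    fl->cached = 0;
    fl->max_cached = max_cached;
}

/* Reuse a cached object if one exists, otherwise fall back to malloc. */
void *free_list_get(FreeList *fl)
{
    if (fl->head) {
        FreeNode *n = fl->head;
        fl->head = n->next;
        fl->cached--;
        return n;
    }
    return malloc(fl->obj_size);
}

/* Cache the object for later reuse, or free it if the cache is full. */
void free_list_put(FreeList *fl, void *obj)
{
    if (fl->cached >= fl->max_cached) {
        free(obj);
        return;
    }
    FreeNode *n = obj;
    n->next = fl->head;
    fl->head = n;
    fl->cached++;
}

/* Release every cached object back to the allocator. */
void free_list_destroy(FreeList *fl)
{
    while (fl->head) {
        FreeNode *n = fl->head;
        fl->head = n->next;
        free(n);
    }
    fl->cached = 0;
}
```

In an RCU setting, the assumption would be that free_list_put() is called
from the reclaim callback after the grace period, so later allocations pick
up the recycled objects.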
Paolo