Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscar

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscar

From:	David Hildenbrand
Subject:	Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections
Date:	Mon, 26 Jul 2021 09:51:25 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

On 24.07.21 00:33, Peter Xu wrote:

On Fri, Jul 23, 2021 at 08:56:54PM +0200, David Hildenbrand wrote:


As I've asked this question previously elsewhere, it's more or less also
related to the design decision of having virtio-mem being able to sparsely
plugged in such a small granularity rather than making the plug/unplug still
continuous within GPA range (so we move page when unplug).


Yes, in an ideal world that would be optimal solution. Unfortunately, we're
not living in an ideal world :)

virtio-mem in Linux guests will as default try unplugging highest-to-lowest
address, and I have on my TODO list an item to shrink the usable region (->
later, shrinking the actual RAMBlock) once possible.

So virtio-mem is prepared for that, but it will only apply in some cases.


There's definitely reasons there and I believe you're the expert on that (as
you mentioned once: some guest GUPed pages cannot migrate so cannot get those
ranges offlined otherwise), but so far I still not sure whether that's a kernel
issue to solve on GUP, although I agree it's a complicated one anyway!


To do something like that reliably, you have to manage hotplugged memory in
a special way, for example, in a movable zone.

We have a at least 4 cases:

a) The guest OS supports the movable zone and uses it for all hotplugged
    memory
b) The guest OS supports the movable zone and uses it for some
    hotplugged memory
c) The guest OS supports the movable zone and uses it for no hotplugged
    memory
d) The guest OS does not support the concept of movable zones


a) is the dream but only applies in some cases if Linux is properly
configured (e.g., never hotplug more than 3 times boot memory)
b) will be possible under Linux soon (e.g., when hotplugging more than 3
times boot memory)
c) is the default under Linux for most Linux distributions
d) Is Windows

In addition, we can still have random unplug errors when using the movable
zone, for example, if someone references a page just a little too long.

Maybe that helps.


Yes, thanks.


Maybe it's a trade-off you made at last, I don't have enough knowledge to tell.


That's the precise description of what virtio-mem is. It's a trade-off
between which OSs we want to support, what the guest OS can actually do, how
we can manage memory in the hypervisor efficiently, ...


The patch itself looks okay to me, there's just a slight worry on not sure how
long would the list be at last; if it's chopped in 1M/2M small chunks.


I don't think that's really an issue: take a look at
qemu_get_guest_memory_mapping(), which will create as many entries as
necessary to express the guest physical mapping of the guest virtual (!)
address space with such chunks. That can be a lot :)


I'm indeed a bit surprised by the "paging" parameter.. I gave it a try, the
list grows into tens of thousands.

Yes, and the bigger the VM, the more entries you should get ... likewith virtio-mem.


One last question: will virtio-mem still do best-effort to move the pages, so
as to grant as less holes as possible?


That depends on the guest OS.

Linux guests will unplug highest-to-lowest addresses. They will trymigrating pages away (alloc_contig_range()) to minimize fragmentation.Further, when (un)plugging, they will try a) unplug within alreadyfragmented Linux memory blocks (e.g., 128 MiB) b) plugging withinalready fragmented Linux memory blocks first. Because the goal is torequire as little as possible Linux memory blocks to reduce metadata(memmap) overhead.

I recall that the Windows prototype also tries unplug highest-to-lowestusing the Windows range allocator, however, I have no idea what thatrange allcoator actually does (if it only grabs free pages or if it canactually move around busy pages).

For Linux guests, there is a work item to continue defragmenting thelayout to free up complete Linux memory blocks over time.

With a 1 TiB virtio-mem device and a 2 MiB block size (default), in theworst case we would get 262144 individual blocks (every second oneplugged). While this is far from realistic, I assume we can getsomething comparable when dumping a huge VM in paging mode.

With 262144 entires, with ~48 byte (6*8 byte) per element, we'd consume12 MiB for the whole list. Not perfect, but not too bad.


--
Thanks,

David / dhildenb

[Prev in Thread]

Current Thread

[Next in Thread]

[PATCH resend v2 3/5] softmmu/memory_mapping: never merge ranges accross memory regions, (continued)
- [PATCH resend v2 3/5] softmmu/memory_mapping: never merge ranges accross memory regions, David Hildenbrand, 2021/07/20
  - Re: [PATCH resend v2 3/5] softmmu/memory_mapping: never merge ranges accross memory regions, Stefan Berger, 2021/07/20
  - Re: [PATCH resend v2 3/5] softmmu/memory_mapping: never merge ranges accross memory regions, Peter Xu, 2021/07/23
- [PATCH resend v2 4/5] softmmu/memory_mapping: factor out adding physical memory ranges, David Hildenbrand, 2021/07/20
  - Re: [PATCH resend v2 4/5] softmmu/memory_mapping: factor out adding physical memory ranges, Stefan Berger, 2021/07/20
  - Re: [PATCH resend v2 4/5] softmmu/memory_mapping: factor out adding physical memory ranges, Peter Xu, 2021/07/23
- [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections, David Hildenbrand, 2021/07/20
  - Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections, Peter Xu, 2021/07/23
    - Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections, David Hildenbrand, 2021/07/23
    - Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections, Peter Xu, 2021/07/23
    - Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections, David Hildenbrand <=
  - Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections, Peter Xu, 2021/07/26

Prev by Date: Re: [PATCH 01/20] Hexagon HVX (target/hexagon) README
Next by Date: Re: [PATCH] hw/intc/arm_gic: Fix set/clear pending of PPI/SPI
Previous by thread: Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections
Next by thread: Re: [PATCH resend v2 5/5] softmmu/memory_mapping: optimize for RamDiscardManager sections
Index(es):
- Date
- Thread