qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH RFC 00/15] virtio-mem: Expose device memory via separate mems


From: Dr. David Alan Gilbert
Subject: Re: [PATCH RFC 00/15] virtio-mem: Expose device memory via separate memslots
Date: Wed, 13 Oct 2021 20:03:35 +0100
User-agent: Mutt/2.0.7 (2021-05-04)

* David Hildenbrand (david@redhat.com) wrote:
> Based-on: 20211011175346.15499-1-david@redhat.com
> 
> A virtio-mem device is represented by a single large RAM memory region
> backed by a single large mmap.
> 
> Right now, we map that complete memory region into guest physical addres
> space, resulting in a very large memory mapping, KVM memory slot, ...
> although only a small amount of memory might actually be exposed to the VM.
> 
> For example, when starting a VM with a 1 TiB virtio-mem device that only
> exposes little device memory (e.g., 1 GiB) towards the VM initialliy,
> in order to hotplug more memory later, we waste a lot of memory on metadata
> for KVM memory slots (> 2 GiB!) and accompanied bitmaps. Although some
> optimizations in KVM are being worked on to reduce this metadata overhead
> on x86-64 in some cases, it remains a problem with nested VMs and there are
> other reasons why we would want to reduce the total memory slot to a
> reasonable minimum.
> 
> We want to:
> a) Reduce the metadata overhead, including bitmap sizes inside KVM but also
>    inside QEMU KVM code where possible.
> b) Not always expose all device-memory to the VM, to reduce the attack
>    surface of malicious VMs without using userfaultfd.
> 
> So instead, expose the RAM memory region not by a single large mapping
> (consuming one memslot) but instead by multiple mappings, each consuming
> one memslot. To do that, we divide the RAM memory region via aliases into
> separate parts and only map the aliases into a device container we actually
> need. We have to make sure that QEMU won't silently merge the memory
> sections corresponding to the aliases (and thereby also memslots),
> otherwise we lose atomic updates with KVM and vhost-user, which we deeply
> care about when adding/removing memory. Further, to get memslot accounting
> right, such merging is better avoided.
> 
> Within the memslots, virtio-mem can (un)plug memory in smaller granularity
> dynamically. So memslots are a pure optimization to tackle a) and b) above.
> 
> Memslots are right now mapped once they fall into the usable device region
> (which grows/shrinks on demand right now either when requesting to
>  hotplug more memory or during/after reboots). In the future, with
> VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we'll be able to (un)map aliases even
> more dynamically when (un)plugging device blocks.
> 
> 
> Adding a 500GiB virtio-mem device and not hotplugging any memory results in:
>     0000000140000000-000001047fffffff (prio 0, i/o): device-memory
>       0000000140000000-0000007e3fffffff (prio 0, i/o): virtio-mem-memslots
> 
> Requesting the VM to consume 2 GiB results in (note: the usable region size
> is bigger than 2 GiB, so 3 * 1 GiB memslots are required):
>     0000000140000000-000001047fffffff (prio 0, i/o): device-memory
>       0000000140000000-0000007e3fffffff (prio 0, i/o): virtio-mem-memslots
>         0000000140000000-000000017fffffff (prio 0, ram): alias 
> virtio-mem-memslot-0 @mem0 0000000000000000-000000003fffffff
>         0000000180000000-00000001bfffffff (prio 0, ram): alias 
> virtio-mem-memslot-1 @mem0 0000000040000000-000000007fffffff
>         00000001c0000000-00000001ffffffff (prio 0, ram): alias 
> virtio-mem-memslot-2 @mem0 0000000080000000-00000000bfffffff

I've got a vague memory that there were some devices that didn't like
doing split IO across a memory region (or something) - some virtio
devices?  Do you know if that's still true and if that causes a problem?

Dave

> Requesting the VM to consume 20 GiB results in:
>     0000000140000000-000001047fffffff (prio 0, i/o): device-memory
>       0000000140000000-0000007e3fffffff (prio 0, i/o): virtio-mem-memslots
>         0000000140000000-000000017fffffff (prio 0, ram): alias 
> virtio-mem-memslot-0 @mem0 0000000000000000-000000003fffffff
>         0000000180000000-00000001bfffffff (prio 0, ram): alias 
> virtio-mem-memslot-1 @mem0 0000000040000000-000000007fffffff
>         00000001c0000000-00000001ffffffff (prio 0, ram): alias 
> virtio-mem-memslot-2 @mem0 0000000080000000-00000000bfffffff
>         0000000200000000-000000023fffffff (prio 0, ram): alias 
> virtio-mem-memslot-3 @mem0 00000000c0000000-00000000ffffffff
>         0000000240000000-000000027fffffff (prio 0, ram): alias 
> virtio-mem-memslot-4 @mem0 0000000100000000-000000013fffffff
>         0000000280000000-00000002bfffffff (prio 0, ram): alias 
> virtio-mem-memslot-5 @mem0 0000000140000000-000000017fffffff
>         00000002c0000000-00000002ffffffff (prio 0, ram): alias 
> virtio-mem-memslot-6 @mem0 0000000180000000-00000001bfffffff
>         0000000300000000-000000033fffffff (prio 0, ram): alias 
> virtio-mem-memslot-7 @mem0 00000001c0000000-00000001ffffffff
>         0000000340000000-000000037fffffff (prio 0, ram): alias 
> virtio-mem-memslot-8 @mem0 0000000200000000-000000023fffffff
>         0000000380000000-00000003bfffffff (prio 0, ram): alias 
> virtio-mem-memslot-9 @mem0 0000000240000000-000000027fffffff
>         00000003c0000000-00000003ffffffff (prio 0, ram): alias 
> virtio-mem-memslot-10 @mem0 0000000280000000-00000002bfffffff
>         0000000400000000-000000043fffffff (prio 0, ram): alias 
> virtio-mem-memslot-11 @mem0 00000002c0000000-00000002ffffffff
>         0000000440000000-000000047fffffff (prio 0, ram): alias 
> virtio-mem-memslot-12 @mem0 0000000300000000-000000033fffffff
>         0000000480000000-00000004bfffffff (prio 0, ram): alias 
> virtio-mem-memslot-13 @mem0 0000000340000000-000000037fffffff
>         00000004c0000000-00000004ffffffff (prio 0, ram): alias 
> virtio-mem-memslot-14 @mem0 0000000380000000-00000003bfffffff
>         0000000500000000-000000053fffffff (prio 0, ram): alias 
> virtio-mem-memslot-15 @mem0 00000003c0000000-00000003ffffffff
>         0000000540000000-000000057fffffff (prio 0, ram): alias 
> virtio-mem-memslot-16 @mem0 0000000400000000-000000043fffffff
>         0000000580000000-00000005bfffffff (prio 0, ram): alias 
> virtio-mem-memslot-17 @mem0 0000000440000000-000000047fffffff
>         00000005c0000000-00000005ffffffff (prio 0, ram): alias 
> virtio-mem-memslot-18 @mem0 0000000480000000-00000004bfffffff
>         0000000600000000-000000063fffffff (prio 0, ram): alias 
> virtio-mem-memslot-19 @mem0 00000004c0000000-00000004ffffffff
>         0000000640000000-000000067fffffff (prio 0, ram): alias 
> virtio-mem-memslot-20 @mem0 0000000500000000-000000053fffffff
> 
> Requesting the VM to consume 5 GiB and rebooting (note: usable region size
> will change during reboots) results in:
>     0000000140000000-000001047fffffff (prio 0, i/o): device-memory
>       0000000140000000-0000007e3fffffff (prio 0, i/o): virtio-mem-memslots
>         0000000140000000-000000017fffffff (prio 0, ram): alias 
> virtio-mem-memslot-0 @mem0 0000000000000000-000000003fffffff
>         0000000180000000-00000001bfffffff (prio 0, ram): alias 
> virtio-mem-memslot-1 @mem0 0000000040000000-000000007fffffff
>         00000001c0000000-00000001ffffffff (prio 0, ram): alias 
> virtio-mem-memslot-2 @mem0 0000000080000000-00000000bfffffff
>         0000000200000000-000000023fffffff (prio 0, ram): alias 
> virtio-mem-memslot-3 @mem0 00000000c0000000-00000000ffffffff
>         0000000240000000-000000027fffffff (prio 0, ram): alias 
> virtio-mem-memslot-4 @mem0 0000000100000000-000000013fffffff
>         0000000280000000-00000002bfffffff (prio 0, ram): alias 
> virtio-mem-memslot-5 @mem0 0000000140000000-000000017fffffff
> 
> 
> In addition to other factors, we limit the number of memslots to 1024 per
> devices and the size of one memslot to at least 1 GiB. So only a 1 TiB
> virtio-mem device could consume 1024 memslots in the "worst" case. To
> calculate a memslot limit for a device, we use a heuristic based on all
> available memslots for memory devices and the percentage of
> "device size":"total memory device area size". Users can further limit
> the maximum number of memslots that will be used by a device by setting
> the "max-memslots" property. It's expected to be set to "0" (auto) in most
> setups.
> 
> In recent setups (e.g., KVM with ~32k memslots, vhost-user with ~4k
> memslots after this series), we'll get the biggest benefit. In special
> setups (e.g., older KVM, vhost kernel with 64 memslots), we'll get some
> benefit -- the individual memslots will be bigger.
> 
> Future work:
> - vhost-user and libvhost-user optimizations for handling more memslots
>   more efficiently.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Ani Sinha <ani@anisinha.ca>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Cc: Richard Henderson <richard.henderson@linaro.org>
> Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
> Cc: kvm@vger.kernel.org
> 
> David Hildenbrand (15):
>   memory: Drop mapping check from
>     memory_region_get_ram_discard_manager()
>   kvm: Return number of free memslots
>   vhost: Return number of free memslots
>   memory: Allow for marking memory region aliases unmergeable
>   vhost: Don't merge unmergeable memory sections
>   memory-device: Move memory_device_check_addable() directly into
>     memory_device_pre_plug()
>   memory-device: Generalize memory_device_used_region_size()
>   memory-device: Support memory devices that consume a variable number
>     of memslots
>   vhost: Respect reserved memslots for memory devices when realizing a
>     vhost device
>   virtio-mem: Set the RamDiscardManager for the RAM memory region
>     earlier
>   virtio-mem: Fix typo in virito_mem_intersect_memory_section() function
>     name
>   virtio-mem: Expose device memory via separate memslots
>   vhost-user: Increase VHOST_USER_MAX_RAM_SLOTS to 496 with
>     CONFIG_VIRTIO_MEM
>   libvhost-user: Increase VHOST_USER_MAX_RAM_SLOTS to 4096
>   virtio-mem: Set "max-memslots" to 0 (auto) for the 6.2 machine
> 
>  accel/kvm/kvm-all.c                       |  24 ++-
>  accel/stubs/kvm-stub.c                    |   4 +-
>  hw/core/machine.c                         |   1 +
>  hw/mem/memory-device.c                    | 167 +++++++++++++++---
>  hw/virtio/vhost-stub.c                    |   2 +-
>  hw/virtio/vhost-user.c                    |   7 +-
>  hw/virtio/vhost.c                         |  17 +-
>  hw/virtio/virtio-mem-pci.c                |  22 +++
>  hw/virtio/virtio-mem.c                    | 202 ++++++++++++++++++++--
>  include/exec/memory.h                     |  23 +++
>  include/hw/mem/memory-device.h            |  32 ++++
>  include/hw/virtio/vhost.h                 |   2 +-
>  include/hw/virtio/virtio-mem.h            |  29 +++-
>  include/sysemu/kvm.h                      |   2 +-
>  softmmu/memory.c                          |  35 +++-
>  stubs/qmp_memory_device.c                 |   5 +
>  subprojects/libvhost-user/libvhost-user.h |   7 +-
>  17 files changed, 499 insertions(+), 82 deletions(-)
> 
> -- 
> 2.31.1
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK




reply via email to

[Prev in Thread] Current Thread [Next in Thread]