qemu-arm

Re: [PATCH v1 0/4] vfio: report NUMA nodes for device memory


From: David Hildenbrand
Subject: Re: [PATCH v1 0/4] vfio: report NUMA nodes for device memory
Date: Fri, 22 Sep 2023 10:15:21 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

On 22.09.23 10:11, Ankit Agrawal wrote:

Typically, a command line for a virt machine with NUMA nodes would
look like:

     -object memory-backend-ram,id=ram-node0,size=1G \
     -numa node,nodeid=0,memdev=ram-node0 \
     -object memory-backend-ram,id=ram-node1,size=1G \
     -numa node,nodeid=1,cpus=0-3,memdev=ram-node1

which defines two nodes: one with memory and all CPUs, and a second
with only memory.

     # numactl -H
     available: 2 nodes (0-1)
     node 0 cpus: 0 1 2 3
     node 0 size: 1003 MB
     node 0 free: 734 MB
     node 1 cpus:
     node 1 size: 975 MB
     node 1 free: 968 MB
     node distances:
     node   0   1
       0:  10  20
       1:  20  10


Could it be a new type of host memory backend? Have you considered
this approach?

Good idea.  Fundamentally the device should not be creating NUMA
nodes, the VM should be configured with NUMA nodes and the device
memory associated with those nodes.

+1. That would also make it fly with DIMMs and virtio-mem, where you
would want CPU-less nodes as well (imagine passing CXL memory to a VM
using virtio-mem).
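
For the virtio-mem case, the NUMA node is likewise specified on the
device rather than on the backend; a minimal command-line sketch
(ids and sizes here are illustrative) might be:

    -object memory-backend-ram,id=vmem0,size=8G \
    -numa node,nodeid=1 \
    -device virtio-mem-pci,id=vm0,memdev=vmem0,node=1,requested-size=4G

where nodeid=1 is an initially empty (CPU-less, memory-less) node that
the virtio-mem device then provides memory to.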


We actually do not add the device memory on the host; instead, we
map it into the QEMU VMA using remap_pfn_range(). Please check out the
mmap function in the vfio-pci variant driver code managing the device:
https://lore.kernel.org/all/20230915025415.6762-1-ankita@nvidia.com/
And I think a host memory backend would need memory that is added on
the host.
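
To illustrate the point, a vfio-pci variant driver mmap handler of this
shape could look roughly like the following kernel-side sketch (the
my_vfio_* names and the memphys/memlength fields are hypothetical, not
the actual driver from the link above):

    /*
     * Hypothetical sketch of a vfio-pci variant driver mmap handler:
     * it inserts the device memory PFNs directly into the QEMU VMA with
     * remap_pfn_range(), so the memory is never added (hotplugged) on
     * the host and has no struct pages backing it.
     */
    static int my_vfio_mmap(struct vfio_device *core_vdev,
                            struct vm_area_struct *vma)
    {
            struct my_vfio_device *my_dev =
                    container_of(core_vdev, struct my_vfio_device, core_device);
            unsigned long req_len = vma->vm_end - vma->vm_start;

            if (req_len > my_dev->memlength)  /* assumed driver field */
                    return -EINVAL;

            vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
            return remap_pfn_range(vma, vma->vm_start,
                                   my_dev->memphys >> PAGE_SHIFT,
                                   req_len, vma->vm_page_prot);
    }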

Moreover, since we want to pass through the entire device memory, the
-object memory-backend-ram would have to be passed a size equal to the
device memory. I wonder if that would be too much trouble for an admin
(or libvirt) launching the QEMU process.

Both these items are avoided by exposing the device memory as a BAR, as
in the current implementation (referenced above), since that lets QEMU
naturally discover the device memory region and mmap it.


Just to clarify: NUMA nodes for DIMMs/NVDIMMs/virtio-mem are configured
on the device, not on the memory backend.

e.g., -device pc-dimm,node=3,memdev=mem1,...
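
Spelled out as a full (illustrative) command-line fragment, with the
DIMM plugged into a memory-only node 1:

    -m 1G,maxmem=2G,slots=1 \
    -object memory-backend-ram,id=m0,size=1G \
    -numa node,nodeid=0,cpus=0-3,memdev=m0 \
    -object memory-backend-ram,id=mem1,size=1G \
    -numa node,nodeid=1 \
    -device pc-dimm,node=1,memdev=mem1

The memory backend only supplies the RAM; the node= property on the
pc-dimm device is what binds that memory to NUMA node 1.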

--
Cheers,

David / dhildenb



