
Re: [PATCH v1 0/4] vfio: report NUMA nodes for device memory


From: David Hildenbrand
Subject: Re: [PATCH v1 0/4] vfio: report NUMA nodes for device memory
Date: Fri, 15 Sep 2023 20:34:41 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

On 15.09.23 16:47, Alex Williamson wrote:
On Fri, 15 Sep 2023 16:19:29 +0200
Cédric Le Goater <clg@redhat.com> wrote:

Hello Ankit,

On 9/15/23 04:45, ankita@nvidia.com wrote:
From: Ankit Agrawal <ankita@nvidia.com>

For devices that allow the CPU to access their memory cache-coherently,
it is sensible to expose such memory as NUMA nodes separate from
the sysmem node. QEMU currently does not provide a mechanism for
creating NUMA nodes associated with a vfio-pci device.

Implement a mechanism to create and associate a set of unique NUMA nodes
with a vfio-pci device.

The NUMA nodes are created by inserting a series of unique proximity
domains (PXMs) into the VM's SRAT ACPI table. The ACPI tables are read
once by the kernel at boot time to determine the NUMA configuration,
which is inflexible after that; hence this feature is incompatible with
device hotplug. The node range associated with the device is
communicated through an ACPI DSD property and can be fetched by the VM
kernel or kernel modules. QEMU's SRAT and DSD builder code is modified
accordingly.

New command line parameters are introduced to give the admin control
over the NUMA node assignment.
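As a rough sketch only (the cover letter does not spell out the syntax here, so the option names below are placeholders, not the actual parameters from the series), an invocation could look something like:

```shell
# Placeholder syntax: "numa-node-start" and "numa-node-count" are
# hypothetical vfio-pci properties illustrating per-device NUMA node
# assignment, not options that exist in QEMU.
qemu-system-aarch64 \
    -machine virt -m 4G \
    -device vfio-pci,host=0009:01:00.0,numa-node-start=2,numa-node-count=8
```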

This approach seems to bypass the NUMA framework in place in QEMU and
will be a challenge for the upper layers. QEMU is generally used from
libvirt when dealing with KVM guests.

Typically, a command line for a virt machine with NUMA nodes would look
like:

    -object memory-backend-ram,id=ram-node0,size=1G \
    -numa node,nodeid=0,cpus=0-3,memdev=ram-node0 \
    -object memory-backend-ram,id=ram-node1,size=1G \
    -numa node,nodeid=1,memdev=ram-node1

which defines two nodes: one with memory and all the CPUs, and a second
with memory only.

    # numactl -H
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3
    node 0 size: 1003 MB
    node 0 free: 734 MB
    node 1 cpus:
    node 1 size: 975 MB
    node 1 free: 968 MB
    node distances:
    node   0   1
      0:  10  20
      1:  20  10

Could it be a new type of host memory backend ? Have you considered
this approach ?

Good idea.  Fundamentally the device should not be creating NUMA nodes;
the VM should be configured with NUMA nodes and the device memory
associated with those nodes.
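Concretely, that suggestion amounts to reusing QEMU's existing -numa machinery to declare the extra nodes up front and then tying the device's memory to one of them. A sketch, assuming a hypothetical numa-node property on vfio-pci (no such property exists today):

```shell
# Declare the nodes with the existing -numa options; node 1 is left
# CPU-less and memdev-less so the device memory can be placed there.
# "numa-node=" on vfio-pci is hypothetical, shown only to illustrate
# the association Alex describes.
qemu-system-aarch64 \
    -machine virt -m 4G \
    -object memory-backend-ram,id=ram-node0,size=1G \
    -numa node,nodeid=0,cpus=0-3,memdev=ram-node0 \
    -numa node,nodeid=1 \
    -device vfio-pci,host=0009:01:00.0,numa-node=1
```

This keeps NUMA topology a machine-level decision, visible to libvirt and the rest of the stack, rather than something a device synthesizes on its own.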

+1. That would also make it fly with DIMMs and virtio-mem, where you would want such nodes as well (imagine passing CXL memory to a VM using virtio-mem).

--
Cheers,

David / dhildenb



