qemu-arm
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 0/2] acpi: report numa nodes for device memory using GI


From: Jonathan Cameron
Subject: Re: [PATCH v6 0/2] acpi: report numa nodes for device memory using GI
Date: Tue, 2 Jan 2024 12:31:43 +0000

On Mon, 25 Dec 2023 10:26:01 +0530
<ankita@nvidia.com> wrote:

> From: Ankit Agrawal <ankita@nvidia.com>
> 
> There are upcoming devices which allow CPU to cache coherently access
> their memory. It is sensible to expose such memory as NUMA nodes separate
> from the sysmem node to the OS. The ACPI spec provides a scheme in SRAT
> called Generic Initiator Affinity Structure [1] to allow an association
> between a Proximity Domain (PXM) and a Generic Initiator (GI) (e.g.
> heterogeneous processors and accelerators, GPUs, and I/O devices with
> integrated compute or DMA engines).
> 
> While a single node per device may cover several use cases, it is however
> insufficient for a full utilization of the NVIDIA GPUs MIG
> (Mult-Instance GPUs) [2] feature. The feature allows partitioning of the
> GPU device resources (including device memory) into several (upto 8)
> isolated instances. Each of the partitioned memory requires a dedicated NUMA
> node to operate. The partitions are not fixed and they can be created/deleted
> at runtime.
> 
> Linux OS does not provide a means to dynamically create/destroy NUMA nodes
> and such feature implementation is expected to be non-trivial. The nodes
> that OS discovers at the boot time while parsing SRAT remains fixed. So we
> utilize the GI Affinity structures that allows association between nodes
> and devices. Multiple GI structures per device/BDF is possible, allowing
> creation of multiple nodes in the VM by exposing unique PXM in each of these
> structures.
> 
> Implement the mechanism to build the GI affinity structures as Qemu currently
> does not. Introduce a new acpi-generic-initiator object that allows an
> association of a set of nodes with a device. During SRAT creation, all such
> objected are identified and used to add the GI Affinity Structures. Currently,
> only PCI device is supported. On a multi device system, each device supporting
> the features needs a unique acpi-generic-initiator object with its own set of
> NUMA nodes associated to it.
> 
> The admin will create a range of 8 nodes and associate that with the device
> using the acpi-generic-initiator object. While a configuration of less than
> 8 nodes per device is allowed, such configuration will prevent utilization of
> the feature to the fullest. This setting is applicable to all the Grace+Hopper
> systems. The following is an example of the Qemu command line arguments to
> create 8 nodes and link them to the device 'dev0':
> 
> -numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 \
> -numa node,nodeid=5 -numa node,nodeid=6 -numa node,nodeid=7 \
> -numa node,nodeid=8 -numa node,nodeid=9 \
> -device 
> vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
> -object acpi-generic-initiator,id=gi0,pci-dev=dev0,host-nodes=2-9 \
> 

I'd find it helpful to see the resulting chunk of SRAT for these examples
(disassembled) in this cover letter and the patches (where there are more 
examples).




reply via email to

[Prev in Thread] Current Thread [Next in Thread]