qemu-trivial
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH] hw/arm/acpi: Pack the SRAT processors structure by node_id a


From: Zengtao (B)
Subject: RE: [PATCH] hw/arm/acpi: Pack the SRAT processors structure by node_id ascending order
Date: Fri, 10 Jan 2020 02:56:35 +0000

> -----Original Message-----
> From: Igor Mammedov [mailto:address@hidden]
> Sent: Thursday, January 09, 2020 5:54 PM
> To: Zengtao (B)
> Cc: Michael S. Tsirkin; address@hidden; address@hidden;
> Shannon Zhao; Peter Maydell; address@hidden
> Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors structure by
> node_id ascending order
> 
> On Thu, 9 Jan 2020 02:45:58 +0000
> "Zengtao (B)" <address@hidden> wrote:
> 
> > > -----Original Message-----
> > > From: Igor Mammedov [mailto:address@hidden]
> > > Sent: Thursday, January 09, 2020 12:39 AM
> > > To: Zengtao (B)
> > > Cc: Michael S. Tsirkin; address@hidden;
> address@hidden;
> > > Shannon Zhao; Peter Maydell; address@hidden
> > > Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors structure
> by
> > > node_id ascending order
> > >
> > > On Wed, 8 Jan 2020 04:02:10 +0000
> > > "Zengtao (B)" <address@hidden> wrote:
> > >
> > > > > -----Original Message-----
> > > > > From: Igor Mammedov [mailto:address@hidden]
> > > > > Sent: Tuesday, January 07, 2020 11:50 PM
> > > > > To: Zengtao (B)
> > > > > Cc: Michael S. Tsirkin; address@hidden;
> > > address@hidden;
> > > > > Shannon Zhao; Peter Maydell; address@hidden
> > > > > Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors
> structure
> > > by
> > > > > node_id ascending order
> > > > >
> > > > > On Tue, 7 Jan 2020 10:29:22 +0000
> > > > > "Zengtao (B)" <address@hidden> wrote:
> > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Michael S. Tsirkin [mailto:address@hidden]
> > > > > > > Sent: Tuesday, January 07, 2020 5:33 PM
> > > > > > > To: Zengtao (B)
> > > > > > > Cc: address@hidden; address@hidden;
> Shannon
> > > > > Zhao;
> > > > > > > Peter Maydell; Igor Mammedov; address@hidden
> > > > > > > Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors
> > > structure
> > > > > by
> > > > > > > node_id ascending order
> > > > > > >
> > > > > > > On Tue, Jan 07, 2020 at 05:18:49PM +0800, Zeng Tao wrote:
> > > > > > > > When booting the guest linux with the following numa
> > > configuration:
> > > > > > > > -numa node,node_id=1,cpus=0-3
> > > > > > > > -numa node,node_id=0,cpus=4-7
> > > > > > > > We can get the following numa topology in the guest system:
> > > > > > > > Architecture:          aarch64
> > > > > > > > Byte Order:            Little Endian
> > > > > > > > CPU(s):                8
> > > > > > > > On-line CPU(s) list:   0-7
> > > > > > > > Thread(s) per core:    1
> > > > > > > > Core(s) per socket:    8
> > > > > > > > Socket(s):             1
> > > > > > > > NUMA node(s):          2
> > > > > > > > L1d cache:             unknown size
> > > > > > > > L1i cache:             unknown size
> > > > > > > > L2 cache:              unknown size
> > > > > > > > NUMA node0 CPU(s):     0-3
> > > > > > > > NUMA node1 CPU(s):     4-7
> > > > > > > > The Cpus 0-3 is assigned with NUMA node 1 in QEMU while it
> get
> > > > > NUMA
> > > > > > > node
> > > > > > > > 0 in the guest.
> > > > > > > >
> > > > > > > > In fact, In the linux kernel, numa_node_id is allocated per the
> > > ACPI
> > > > > > > > SRAT processors structure order,so the cpu 0 will be the first
> one
> > > to
> > > > > > > > allocate its NUMA node id, so it gets the NUMA node 0.
> > > > > > > >
> > > > > > > > To fix this issue, we pack the SRAT processors structure in
> numa
> > > node
> > > > > id
> > > > > > > > order but not the default cpu number order.
> > > > > > > >
> > > > > > > > Signed-off-by: Zeng Tao <address@hidden>
> > > > > > >
> > > > > > >
> > > > > > > Does this matter? If yes fixing linux to take node id from
> proximity
> > > > > > > field in ACPI seems cleaner ...
> > > > > > >
> > > > > >
> > > > > > In fact, I just want to align the node_id concept in QEMU and
> Linux.
> > > > > > If we fix the kernel side, we need to align with all platforms.
> > > > > > i think maybe not a good idea.
> > > > > if linux makes up node ID's on it's own, it would be hard for it to
> > > > > map SRAT entries to other tables that use proximity id as well.
> > > > >
> > > > > So it would need to maintain map of [proximity id] -> [node id] (and
> > > reverse)
> > > > > somewhere to resolve mappings on other tables.
> > > > > If it doesn't do this then it's broken and works just by accident,
> > > > > if it does the fix probably should be in that code and not in QEMU.
> > > > >
> > > >
> > > > Hmm, the problem is how to understand the concept of node id.
> > > > 1. In dts, there is node id. Both the QEMU and Linux can use it
> > > > directly, so no conflict.
> > > > 2. In ACPI, there is only proximity domain, but no node id, there
> > > >  should be a mapping between them, while kernel and QEMU
> maintain
> > > >  their own rule, and unfortunately they conflict with each other
> > > >  sometimes.
> > > >
> > > > There is no specification to indicate what we should do to maintain
> the
> > > > mapping, so it's hard to align the QEMU and Linux.
> > > >
> > > > Any suggestion, or we just accept it as a rule since it don't affect
> much?
> > >
> > > If node id generation is driven by SRAT content, it might be reasonable
> to
> > > ask for SRAT parser in kernel to create node ids using proximity value
> > > instead of the order ACPI_SRAT_PROCESSOR_GICC structures are
> > > enumerated.
> > > That way node id would match ACPI spec.
> > >
> >
> > I don't quite understand "That way node id would match ACPI spec."
> > I check the ACPI 6.3 spec, I didn't find any description that node id should
> > be equal to proximity value, in section 6.2.15, there is indeed an example
> > which node numbers equals to proximity value.
> 
> There is only proximity id in spec (which conceptually is the same as node
> id).
> What I'm saying is that linux kernel ACPI parser could use it as node id
> instead of using counter for making up node ids.
> 
> (issue with it is that proximity values could be sparse, so one
> would need to audit current node id users to make sure that using
> proximity id won't break anything)
> 
I think that is really a problem, in linux kernel:
#if MAX_NUMNODEs > 256
#define MAX_PXM_DOMAINS MAX_NUMNODEs
#else
#define MAX_PXM_DOMAINS 256
#endif
It seems that using counter is a better choice. 

> > Thanks
> >
> > > But even with that I'd wouldn't expect cpu ids match as its basically
> > > arbitrary numbers on both sided. One would need to use arch specific
> ids
> > > to reliably match cpus on both sides (MPIDR in ARM case or APICID in
> x86).
> > >.
> > > > > >
> > > > > > > > ---
> > > > > > > >  hw/arm/virt-acpi-build.c | 23 +++++++++++++++--------
> > > > > > > >  1 file changed, 15 insertions(+), 8 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > > > > > > > index bd5f771..497192b 100644
> > > > > > > > --- a/hw/arm/virt-acpi-build.c
> > > > > > > > +++ b/hw/arm/virt-acpi-build.c
> > > > > > > > @@ -520,7 +520,8 @@ build_srat(GArray *table_data,
> > > BIOSLinker
> > > > > > > *linker, VirtMachineState *vms)
> > > > > > > >      AcpiSystemResourceAffinityTable *srat;
> > > > > > > >      AcpiSratProcessorGiccAffinity *core;
> > > > > > > >      AcpiSratMemoryAffinity *numamem;
> > > > > > > > -    int i, srat_start;
> > > > > > > > +    int i, j, srat_start;
> > > > > > > > +    uint32_t node_id;
> > > > > > > >      uint64_t mem_base;
> > > > > > > >      MachineClass *mc = MACHINE_GET_CLASS(vms);
> > > > > > > >      MachineState *ms = MACHINE(vms);
> > > > > > > > @@ -530,13 +531,19 @@ build_srat(GArray *table_data,
> > > BIOSLinker
> > > > > > > *linker, VirtMachineState *vms)
> > > > > > > >      srat = acpi_data_push(table_data, sizeof(*srat));
> > > > > > > >      srat->reserved1 = cpu_to_le32(1);
> > > > > > > >
> > > > > > > > -    for (i = 0; i < cpu_list->len; ++i) {
> > > > > > > > -        core = acpi_data_push(table_data, sizeof(*core));
> > > > > > > > -        core->type = ACPI_SRAT_PROCESSOR_GICC;
> > > > > > > > -        core->length = sizeof(*core);
> > > > > > > > -        core->proximity =
> > > > > > > cpu_to_le32(cpu_list->cpus[i].props.node_id);
> > > > > > > > -        core->acpi_processor_uid = cpu_to_le32(i);
> > > > > > > > -        core->flags = cpu_to_le32(1);
> > > > > > > > +    for (i = 0; i < ms->numa_state->num_nodes; ++i) {
> > > > > > > > +        for (j = 0; j < cpu_list->len; ++j) {
> > > > > > >
> > > > > > > Hmm O(n ^2) isn't great ...
> > > > > > Good suggestion, 3Q.
> > > > > > >
> > > > > > > > +            node_id =
> > > > > cpu_to_le32(cpu_list->cpus[j].props.node_id);
> > > > > > > > +            if (node_id != i) {
> > > > > > > > +                continue;
> > > > > > > > +            }
> > > > > > > > +            core = acpi_data_push(table_data,
> sizeof(*core));
> > > > > > > > +            core->type = ACPI_SRAT_PROCESSOR_GICC;
> > > > > > > > +            core->length = sizeof(*core);
> > > > > > > > +            core->proximity = node_id;
> > > > > > > > +            core->acpi_processor_uid = cpu_to_le32(j);
> > > > > > > > +            core->flags = cpu_to_le32(1);
> > > > > > > > +        }
> > > > > > > >      }
> > > > > > >
> > > > > > > is the issue arm specific? wouldn't it affect x86 too?
> > > > > > >
> > > > > > Good question, I think it will affect x86, but I need to confirm.
> > > > > >
> > > > > > > >      mem_base = vms->memmap[VIRT_MEM].base;
> > > > > > > > --
> > > > > > > > 2.8.1
> > > > > >
> > > >
> >




reply via email to

[Prev in Thread] Current Thread [Next in Thread]