qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-ppc] [Qemu-devel] [PATCH for 3.1] spapr: Fix ibm, max-associa


From: Greg Kurz
Subject: Re: [Qemu-ppc] [Qemu-devel] [PATCH for 3.1] spapr: Fix ibm, max-associativity-domains property number of nodes
Date: Wed, 21 Nov 2018 14:58:18 +0100

On Tue, 20 Nov 2018 20:58:45 +0200
Serhii Popovych <address@hidden> wrote:

> Greg Kurz wrote:
> > On Mon, 19 Nov 2018 14:48:34 +0100
> > Laurent Vivier <address@hidden> wrote:
> >   
> >> On 19/11/2018 14:27, Greg Kurz wrote:  
> >>> On Mon, 19 Nov 2018 08:09:38 -0500
> >>> Serhii Popovych <address@hidden> wrote:
> >>>     
> >>>> Laurent Vivier reported off by one with maximum number of NUMA nodes
> >>>> provided by qemu-kvm being less by one than required according to
> >>>> description of "ibm,max-associativity-domains" property in LoPAPR.
> >>>>
> >>>> It appears that I incorrectly treated LoPAPR description of this
> >>>> property assuming it provides last valid domain (NUMA node here)
> >>>> instead of maximum number of domains.
> >>>>
> >>>>   ### Before hot-add
> >>>>
> >>>>   (qemu) info numa
> >>>>   3 nodes
> >>>>   node 0 cpus: 0
> >>>>   node 0 size: 0 MB
> >>>>   node 0 plugged: 0 MB
> >>>>   node 1 cpus:
> >>>>   node 1 size: 1024 MB
> >>>>   node 1 plugged: 0 MB
> >>>>   node 2 cpus:
> >>>>   node 2 size: 0 MB
> >>>>   node 2 plugged: 0 MB
> >>>>
> >>>>   $ numactl -H
> >>>>   available: 2 nodes (0-1)
> >>>>   node 0 cpus: 0
> >>>>   node 0 size: 0 MB
> >>>>   node 0 free: 0 MB
> >>>>   node 1 cpus:
> >>>>   node 1 size: 999 MB
> >>>>   node 1 free: 658 MB
> >>>>   node distances:
> >>>>   node   0   1
> >>>>     0:  10  40
> >>>>     1:  40  10
> >>>>
> >>>>   ### Hot-add
> >>>>
> >>>>   (qemu) object_add memory-backend-ram,id=mem0,size=1G
> >>>>   (qemu) device_add pc-dimm,id=dimm1,memdev=mem0,node=2
> >>>>   (qemu) [   87.704898] pseries-hotplug-mem: Attempting to hot-add 4 ...
> >>>>   <there is no "Initmem setup node 2 [mem 0xHEX-0xHEX]">
> >>>>   [   87.705128] lpar: Attempting to resize HPT to shift 21
> >>>>   ... <HPT resize messages>
> >>>>
> >>>>   ### After hot-add
> >>>>
> >>>>   (qemu) info numa
> >>>>   3 nodes
> >>>>   node 0 cpus: 0
> >>>>   node 0 size: 0 MB
> >>>>   node 0 plugged: 0 MB
> >>>>   node 1 cpus:
> >>>>   node 1 size: 1024 MB
> >>>>   node 1 plugged: 0 MB
> >>>>   node 2 cpus:
> >>>>   node 2 size: 1024 MB
> >>>>   node 2 plugged: 1024 MB
> >>>>
> >>>>   $ numactl -H
> >>>>   available: 2 nodes (0-1)
> >>>>   ^^^^^^^^^^^^^^^^^^^^^^^^
> >>>>              Still only two nodes (and memory hot-added to node 0 below)
> >>>>   node 0 cpus: 0
> >>>>   node 0 size: 1024 MB
> >>>>   node 0 free: 1021 MB
> >>>>   node 1 cpus:
> >>>>   node 1 size: 999 MB
> >>>>   node 1 free: 658 MB
> >>>>   node distances:
> >>>>   node   0   1
> >>>>     0:  10  40
> >>>>     1:  40  10
> >>>>
> >>>> After fix applied numactl(8) reports 3 nodes available and memory
> >>>> plugged into node 2 as expected.
> >>>>
> >>>> Fixes: da9f80fbad21 ("spapr: Add ibm,max-associativity-domains property")
> >>>> Reported-by: Laurent Vivier <address@hidden>
> >>>> Signed-off-by: Serhii Popovych <address@hidden>
> >>>> ---
> >>>>  hw/ppc/spapr.c | 2 +-
> >>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>>> index 7afd1a1..843ae6c 100644
> >>>> --- a/hw/ppc/spapr.c
> >>>> +++ b/hw/ppc/spapr.c
> >>>> @@ -1033,7 +1033,7 @@ static void spapr_dt_rtas(sPAPRMachineState 
> >>>> *spapr, void *fdt)
> >>>>          cpu_to_be32(0),
> >>>>          cpu_to_be32(0),
> >>>>          cpu_to_be32(0),
> >>>> -        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes - 1 : 0),
> >>>> +        cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 0),    
> >>>
> >>> Maybe simply cpu_to_be32(nb_numa_nodes) ?    
> >>
> >> Or "cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 1)" ?  
> 
> Linux handles zero correctly, but nb_numa_nodes ?: 1 looks better.
> 
> I did testing with just cpu_to_be32(nb_numa_nodes) and
> cpu_to_be32(nb_numa_nodes ? nb_numa_nodes : 1) it works with Linux
> correctly in both cases
> 
> (guest)# numactl -H
> available: 1 nodes (0)
> node 0 cpus: 0
> node 0 size: 487 MB
> node 0 free: 148 MB
> node distances:
> node   0
>   0:  10
> 
> (qemu) info numa
> 0 nodes
> 
> >>
> >> In spapr_populate_drconf_memory() we have this logic.
> >>  
> > 
> > Hmm... maybe you're right, it seems that the code assumes
> > non-NUMA configs have at one node. Similar assumption is
> > also present in pc_dimm_realize():
> > 
> >     if (((nb_numa_nodes > 0) && (dimm->node >= nb_numa_nodes)) ||
> >         (!nb_numa_nodes && dimm->node))   
> According to this nb_numa_nodes can be zero
> 
> >         error_setg(errp, "'DIMM property " PC_DIMM_NODE_PROP " has value %"
> >                    PRIu32 "' which exceeds the number of numa nodes: %d",
> >                    dimm->node, nb_numa_nodes ? nb_numa_nodes : 1);  
> and this just handles this case to show proper error message.
> 

Indeed but it doesn't really explain why we're doing this...

> >         return;
> >     }  
> 
> > 
> > This is a bit confusing...

... fortunately, these commits shed some light:

commit 7db8a127e373e468d1f61e46e01e50d1aa33e827
Author: Alexey Kardashevskiy <address@hidden>
Date:   Thu Jul 3 13:10:04 2014 +1000

    spapr: Refactor spapr_populate_memory() to allow memoryless nodes
    
    Current QEMU does not support memoryless NUMA nodes, however
    actual hardware may have them so it makes sense to have a way
    to emulate them in QEMU. This prepares SPAPR for that.
    
    This moves 2 calls of spapr_populate_memory_node() into
    the existing loop over numa nodes so first several nodes may
    have no memory and this still will work.
    
    If there is no numa configuration, the code assumes there is just
    a single node at 0 and it has all the guest memory.
    
    Signed-off-by: Alexey Kardashevskiy <address@hidden>
    Signed-off-by: Alexander Graf <address@hidden>


commit 6663864e950d40c467ae4ab81c4dac64d7a8d9e6
Author: Bharata B Rao <address@hidden>
Date:   Mon Aug 3 11:05:40 2015 +0530

    spapr: Populate ibm,associativity-lookup-arrays correctly for non-NUMA

    When NUMA isn't configured explicitly, assume node 0 is present for
    the purpose of creating ibm,associativity-lookup-arrays property
    under ibm,dynamic-reconfiguration-memory DT node. This ensures that
    the associativity index property is correctly updated in ibm,dynamic-memory
    for the LMB that is hotplugged.
    
    Signed-off-by: Bharata B Rao <address@hidden>
    Reviewed-by: David Gibson <address@hidden>
    Signed-off-by: David Gibson <address@hidden>


So I guess ?: 1 is consistent with the longstanding assumption in spapr 
that the machine always has a "node 0", even for non-NUMA setups.

Maybe this logic should be consolidated in some helper for better
clarity.

> >   
> >> Thanks,
> >> Laurent  
> > 
> >   
> 
> 

Attachment: pgpWP4YJqLYoR.pgp
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]