qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/2] hw/386: Fix uninitialized memory with -device and CPU ho


From: Igor Mammedov
Subject: Re: [PATCH 1/2] hw/386: Fix uninitialized memory with -device and CPU hotplug
Date: Wed, 24 Jun 2020 15:47:35 +0200

On Tue, 16 Jun 2020 12:18:56 -0500
Babu Moger <babu.moger@amd.com> wrote:

> > -----Original Message-----
> > From: Igor Mammedov <imammedo@redhat.com>
> > Sent: Tuesday, June 16, 2020 5:59 AM
> > To: Moger, Babu <Babu.Moger@amd.com>
> > Cc: pbonzini@redhat.com; rth@twiddle.net; ehabkost@redhat.com;
> > mst@redhat.com; marcel.apfelbaum@gmail.com; qemu-devel@nongnu.org
> > Subject: Re: [PATCH 1/2] hw/386: Fix uninitialized memory with -device and 
> > CPU
> > hotplug
> > 
> > On Mon, 08 Jun 2020 15:18:50 -0500
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> > > Noticed the following command failure while testing CPU hotplug.
> > >
> > > $ qemu-system-x86_64 -machine q35,accel=kvm -smp 1,maxcpus=2,
> > >   cores=1, threads=1,sockets=2 -cpu EPYC -device EPYC-x86_64-
> > >   cpu,core-id=0,socket-id=1,thread-id=0
> > >
> > >   qemu-system-x86_64: -device EPYC-x86_64-cpu,core-id=0,socket-id=1,
> > >   thread-id=0: Invalid CPU [socket: 21855, die: 0, core: 0, thread: 0]
> > >   with APIC ID 21855, valid index range 0:1
> > >
> > > This happens because APIC ID is calculated using uninitialized memory.
> > > This is happening after the addition of new field node_id in X86CPUTopoIDs
> > > structure. The node_id field is uninitialized while calling
> > > apicid_from_topo_ids. The problem is discussed in the thread below.
> > >  
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.ker
> > nel.org%2Fqemu-
> > devel%2F20200602171838.GG577771%40habkost.net%2F&amp;data=02%7C01
> > %7Cbabu.moger%40amd.com%7C02200d75fd8b48d1955608d811e44f5b%7C3d
> > d8961fe4884e608e11a82d994e183d%7C0%7C0%7C637279019564311233&amp
> > ;sdata=ry3QO0Z5dxLPoRxkYVkOsVm3nl%2BxfCGv8be%2BMHdoUPY%3D&amp;r
> > eserved=0  
> > >
> > > Fix the problem by initializing the node_id properly.
> > >
> > > Signed-off-by: Babu Moger <babu.moger@amd.com>
> > > ---
> > >  hw/i386/pc.c               |    2 ++
> > >  include/hw/i386/topology.h |   11 +++++++++++
> > >  2 files changed, 13 insertions(+)
> > >
> > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > index 2128f3d6fe..974cc30891 100644
> > > --- a/hw/i386/pc.c
> > > +++ b/hw/i386/pc.c
> > > @@ -1585,6 +1585,8 @@ static void pc_cpu_pre_plug(HotplugHandler  
> > *hotplug_dev,  
> > >          topo_ids.die_id = cpu->die_id;
> > >          topo_ids.core_id = cpu->core_id;
> > >          topo_ids.smt_id = cpu->thread_id;
> > > +        topo_ids.node_id = 
> > > cpu_x86_use_epyc_apic_id_encoding(ms->cpu_type)  
> > ?  
> > > +                           x86_node_id_for_epyc(&topo_info, &topo_ids) : 
> > > 0;  
> > 
> > I'd rather not calculate some default value here,
> > this is the branch where we check user provided topology info and error out
> > asking
> > to provide missing bits.  
> Noticed that cpu->node_id is initialized to 0xFF(NUMA_NODE_UNASSIGNED).
> We can initialize cpu->node_id to default node like how we do it in
> x86_get_default_cpu_node_id.  We can use it to initialize topo_ids.node_id.
> This is consistent with other fields core_id, die_id etc..
> 
> > 
> > I also wonder if we should force user to specify numa nodes on CLI if EPYC 
> > cpu is
> > used.
> > (i.e. I'm assuming that EPYC always requires numa)  
> 
> That is not true. Without numa all the cpus will be configured under one
> default numa node 0. Like we do it using x86_get_default_cpu_node_id.

get_default_cpu_node_id() which is making things up, is going to be removed
eventually in favor of asking user to provide numa mapping explicitly on CLI.

now if it's always only node 0, why do we need to calculate it then,
why not just assing 0 directly?

what if we have several sockets, would all vCPUs still have node-id = 0?

Another question is if we have CPUs with only 1 numa node set on all of them,
does it require all other machinery like ACPI SRAT table defined or it's just
internal CPU impl. detail?


> >   
> > >          cpu->apic_id = x86ms->apicid_from_topo_ids(&topo_info, 
> > > &topo_ids);
> > >      }
> > >
> > > diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
> > > index 07239f95f4..ee4deb84c4 100644
> > > --- a/include/hw/i386/topology.h
> > > +++ b/include/hw/i386/topology.h
> > > @@ -140,6 +140,17 @@ static inline unsigned  
> > apicid_pkg_offset_epyc(X86CPUTopoInfo *topo_info)  
> > >             apicid_node_width_epyc(topo_info);
> > >  }
> > >
> > > +static inline unsigned x86_node_id_for_epyc(X86CPUTopoInfo *topo_info,
> > > +                                            const X86CPUTopoIDs 
> > > *topo_ids)
> > > +{
> > > +    unsigned nr_nodes = MAX(topo_info->nodes_per_pkg, 1);
> > > +    unsigned cores_per_node = DIV_ROUND_UP((topo_info->dies_per_pkg *
> > > +                                            topo_info->cores_per_die *
> > > +                                            topo_info->threads_per_core),
> > > +                                            nr_nodes);
> > > +
> > > +    return (topo_ids->core_id / cores_per_node) % nr_nodes;  
> > what if nr_nodes == 0?
> >   
> > > +}
> > >  /*
> > >   * Make APIC ID for the CPU based on Pkg_ID, Core_ID, SMT_ID
> > >   *
> > >
> > >  
> 
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]