Re: [PATCH 0/2] i386: fixup number of logical CPUs when host-cache-info=on


From: Alejandro Jimenez
Subject: Re: [PATCH 0/2] i386: fixup number of logical CPUs when host-cache-info=on
Date: Wed, 25 May 2022 17:20:02 -0400
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1



On 5/25/2022 3:56 PM, Moger, Babu wrote:

On 5/24/22 18:23, Alejandro Jimenez wrote:
On 5/24/2022 3:48 PM, Moger, Babu wrote:

On 5/24/22 10:19, Igor Mammedov wrote:
On Tue, 24 May 2022 11:10:18 -0400
Igor Mammedov <imammedo@redhat.com> wrote:

CCing AMD folks as that might be of interest to them

I am trying to recreate the bug on my AMD system here, but I am seeing this
message:

qemu-system-x86_64: -numa node,nodeid=0,memdev=ram-node0: memdev=ram-node0
is ambiguous

Here is my command line:

#qemu-system-x86_64 -name rhel8 -m 4096 -hda vdisk.qcow2 -enable-kvm -net
nic  -nographic -machine q35,accel=kvm -cpu
host,host-cache-info=on,l3-cache=off -smp
20,sockets=2,dies=1,cores=10,threads=1 -numa
node,nodeid=0,memdev=ram-node0 -numa node,nodeid=1,memdev=ram-node1 -numa
cpu,socket-id=0,node-id=0 -numa cpu,socket-id=1,node-id=1

Am I missing something?
Hi Babu,

Hopefully this will help you reproduce the issue if you are testing on
Milan/Genoa. Joao (CC'd) pointed out this warning to me late last year,
while I was working on patches for encoding the topology CPUID leaf in
different Zen platforms.

What I found from my experiments on Milan is that the warning appears
whenever the NUMA topology requested on the QEMU command line assigns
fewer CPUs to each node than the number of CPUs sharing an LLC on the
host platform. In short, on a Milan host we have 16 CPUs sharing a CCX;
see the shared_cpu_list output quoted further below.

Yes. I recreated the issue with the following command line:

#qemu-system-x86_64 -name rhel8 -m 4096 -hda vdisk.qcow2 -enable-kvm -net
nic  -nographic -machine q35,accel=kvm -cpu host,+topoext -smp
16,sockets=1,dies=1,cores=16,threads=1 -object
memory-backend-ram,id=ram-node0,size=2G -object
memory-backend-ram,id=ram-node1,size=2G  -numa
node,nodeid=0,cpus=0-7,memdev=ram-node0 -numa
node,nodeid=1,cpus=8-15,memdev=ram-node1

But solving this will be a bit complicated. For AMD, this information comes
from CPUID 0x8000001D. But when that CPUID leaf is being populated, we don't
yet have all the information about NUMA nodes etc.
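
For reference, a minimal userspace sketch (an illustration of the leaf layout, not QEMU code) of how that sharing information is encoded in CPUID Fn8000_001D on hosts exposing TOPOEXT; per the AMD APM, EAX[4:0] is the cache type (0 ends the list), EAX[7:5] the cache level, and EAX[25:14] the number of logical processors sharing the cache, minus one:

/* Walk CPUID Fn8000_001D and print how many logical processors
 * share each cache level on the host. Compile with gcc on x86-64. */
#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    for (unsigned int subleaf = 0; ; subleaf++) {
        if (!__get_cpuid_count(0x8000001d, subleaf, &eax, &ebx, &ecx, &edx)) {
            break;                                        /* leaf not supported */
        }
        if ((eax & 0x1f) == 0) {                          /* EAX[4:0]: no more caches */
            break;
        }
        unsigned int level = (eax >> 5) & 0x7;            /* EAX[7:5]: cache level */
        unsigned int sharing = ((eax >> 14) & 0xfff) + 1; /* EAX[25:14], plus one */
        printf("L%u cache: shared by %u logical processors\n", level, sharing);
    }
    return 0;
}

On the Milan host discussed below, this should report the L3 as shared by 16 logical processors, matching the shared_cpu_list output.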

But you can work around it by modifying the command line to include dies
information (dies=2 in this case). Something like this:
Makes sense; with dies=2 the cache topology leaf is built with 8 cores/CCX (16 cores / 2 dies), matching the 8 CPUs assigned to each NUMA node, so all is well.

#qemu-system-x86_64 -name rhel8 -m 4096 -hda vdisk.qcow2 -enable-kvm -net
nic  -nographic -machine q35,accel=kvm -cpu
host,+topoext,host-cache-info=on -smp
16,sockets=1,dies=2,cores=8,threads=1 -object
memory-backend-ram,id=ram-node0,size=2G -object
memory-backend-ram,id=ram-node1,size=2G  -numa
node,nodeid=0,cpus=0-7,memdev=ram-node0 -numa
node,nodeid=1,cpus=8-15,memdev=ram-node1

But this may not be an acceptable solution in all cases.
This is not specific to the host-cache-info behavior, so it is probably better to discuss it separately. With that being said...

The idea that I considered was to automatically calculate a value of 'dies' iff an explicit value was not requested via the '-smp' options, instead of just using the current default of dies=1; i.e., automatically mimic the host cache topology in the guest, so that a guest running on Rome sees 4 cores/CCX while one running on Milan sees 8 cores/CCX. This can be done by querying the host CPUID and using that info to build the guest CPUID leaf in QEMU, similar to what Igor is doing here, but also adjusting the number of dies that is encoded.
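
A hypothetical sketch of that calculation (illustrative only: host_cores_per_llc() is a made-up helper, not an existing QEMU function, and this is not the actual prototype code):

/* If the user did not ask for an explicit dies= value, derive one from
 * the host cores-per-LLC so each guest die (CCX) mirrors a host LLC. */
static unsigned int pick_default_dies(unsigned int cores, unsigned int dies_requested)
{
    if (dies_requested != 0) {
        return dies_requested;          /* honor an explicit -smp dies= value */
    }
    unsigned int cores_per_llc = host_cores_per_llc(); /* e.g. 4 on Rome, 8 on Milan */
    if (cores_per_llc == 0 || cores <= cores_per_llc) {
        return 1;                       /* everything fits in one LLC domain */
    }
    /* round up so no die spans more cores than a host LLC covers */
    return (cores + cores_per_llc - 1) / cores_per_llc;
}

With cores=16 on a Milan host (8 cores per LLC) this yields dies=2, the same value as the manual workaround above.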

I built prototype code that seemed to work correctly, but it did not consider the complications added by the '-numa' options.

I think there is a much larger debate involved about what defaults are "sane", so rather than derailing this thread further, I'll send a follow-up message in the future when I can take another look at the prototype patches I have.

Thank you,
Alejandro


# cat /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list
0-7,128-135

If a guest is launched with the following arguments:

-cpu host,+topoext \
-smp cpus=64,cores=32,threads=2,sockets=1 \
-numa node,nodeid=0,cpus=0-7 -numa node,nodeid=1,cpus=8-15 \
-numa node,nodeid=2,cpus=16-23 -numa node,nodeid=3,cpus=24-31 \
-numa node,nodeid=4,cpus=32-39 -numa node,nodeid=5,cpus=40-47 \
-numa node,nodeid=6,cpus=48-55 -numa node,nodeid=7,cpus=56-63 \

it assigns 8 CPUs to each NUMA node, fewer than the 16 CPUs sharing an LLC
on the host, causing the warning described above to be displayed.

Note that ultimately the guest topology is built based on the NUMA
information, so each LLC domain on the guest ends up spanning only a
single NUMA node, e.g.:

# cat /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list
0-7

Hope that helps,
Alejandro



Igor Mammedov (2):
    x86: cpu: make sure number of addressable IDs for processor cores
      meets the spec
    x86: cpu: fixup number of addressable IDs for logical processors
      sharing cache

   target/i386/cpu.c | 20 ++++++++++++++++----
   1 file changed, 16 insertions(+), 4 deletions(-)
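
For context, the encoding rule the two patches are concerned with (a sketch per my reading of the CPUID spec, not the patch code itself): the "maximum number of addressable IDs" fields in the cache leaves hold the count rounded up to a power of two, minus one, because the IDs are carved out of the APIC ID space.

/* Sketch: encode a CPU count the way the "maximum number of addressable
 * IDs" CPUID fields expect it: next power of two, minus one. */
static unsigned int encode_max_ids(unsigned int count)
{
    unsigned int n = 1;
    while (n < count) {
        n <<= 1;            /* round up to the next power of two */
    }
    return n - 1;           /* the field stores the value minus one */
}

For example, the 10 cores per socket in the first command line above would encode as 15 (16 addressable core IDs), and 20 logical processors sharing a cache would encode as 31.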




