[Bug 1856335] Re: Cache Layout wrong on many Zen Arch CPUs
From: Heiko Sieger
Subject: [Bug 1856335] Re: Cache Layout wrong on many Zen Arch CPUs
Date: Wed, 15 Apr 2020 20:46:00 -0000
Same problem on a Ryzen 9 3900X: there should be four L3 caches, but
only three are reported.
I get the same results with "host-passthrough" and "EPYC-IBPB". Windows
doesn't recognize the correct L3 cache layout.
From coreinfo.exe:
Logical Processor to Cache Map:
**---------------------- Data Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64
**---------------------- Instruction Cache 0, Level 1, 32 KB, Assoc 8, LineSize 64
**---------------------- Unified Cache 0, Level 2, 512 KB, Assoc 8, LineSize 64
********---------------- Unified Cache 1, Level 3, 16 MB, Assoc 16, LineSize 64
--**-------------------- Data Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64
--**-------------------- Instruction Cache 1, Level 1, 32 KB, Assoc 8, LineSize 64
--**-------------------- Unified Cache 2, Level 2, 512 KB, Assoc 8, LineSize 64
----**------------------ Data Cache 2, Level 1, 32 KB, Assoc 8, LineSize 64
----**------------------ Instruction Cache 2, Level 1, 32 KB, Assoc 8, LineSize 64
----**------------------ Unified Cache 3, Level 2, 512 KB, Assoc 8, LineSize 64
------**---------------- Data Cache 3, Level 1, 32 KB, Assoc 8, LineSize 64
------**---------------- Instruction Cache 3, Level 1, 32 KB, Assoc 8, LineSize 64
------**---------------- Unified Cache 4, Level 2, 512 KB, Assoc 8, LineSize 64
--------**-------------- Data Cache 4, Level 1, 32 KB, Assoc 8, LineSize 64
--------**-------------- Instruction Cache 4, Level 1, 32 KB, Assoc 8, LineSize 64
--------**-------------- Unified Cache 5, Level 2, 512 KB, Assoc 8, LineSize 64
--------********-------- Unified Cache 6, Level 3, 16 MB, Assoc 16, LineSize 64
----------**------------ Data Cache 5, Level 1, 32 KB, Assoc 8, LineSize 64
----------**------------ Instruction Cache 5, Level 1, 32 KB, Assoc 8, LineSize 64
----------**------------ Unified Cache 7, Level 2, 512 KB, Assoc 8, LineSize 64
------------**---------- Data Cache 6, Level 1, 32 KB, Assoc 8, LineSize 64
------------**---------- Instruction Cache 6, Level 1, 32 KB, Assoc 8, LineSize 64
------------**---------- Unified Cache 8, Level 2, 512 KB, Assoc 8, LineSize 64
--------------**-------- Data Cache 7, Level 1, 32 KB, Assoc 8, LineSize 64
--------------**-------- Instruction Cache 7, Level 1, 32 KB, Assoc 8, LineSize 64
--------------**-------- Unified Cache 9, Level 2, 512 KB, Assoc 8, LineSize 64
----------------**------ Data Cache 8, Level 1, 32 KB, Assoc 8, LineSize 64
----------------**------ Instruction Cache 8, Level 1, 32 KB, Assoc 8, LineSize 64
----------------**------ Unified Cache 10, Level 2, 512 KB, Assoc 8, LineSize 64
----------------******** Unified Cache 11, Level 3, 16 MB, Assoc 16, LineSize 64
------------------**---- Data Cache 9, Level 1, 32 KB, Assoc 8, LineSize 64
------------------**---- Instruction Cache 9, Level 1, 32 KB, Assoc 8, LineSize 64
------------------**---- Unified Cache 12, Level 2, 512 KB, Assoc 8, LineSize 64
--------------------**-- Data Cache 10, Level 1, 32 KB, Assoc 8, LineSize 64
--------------------**-- Instruction Cache 10, Level 1, 32 KB, Assoc 8, LineSize 64
--------------------**-- Unified Cache 13, Level 2, 512 KB, Assoc 8, LineSize 64
----------------------** Data Cache 11, Level 1, 32 KB, Assoc 8, LineSize 64
----------------------** Instruction Cache 11, Level 1, 32 KB, Assoc 8, LineSize 64
----------------------** Unified Cache 14, Level 2, 512 KB, Assoc 8, LineSize 64
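
For comparison, the L3 map one would expect can be sketched in a few
lines of Python. This is only an illustration, assuming the 3900X
exposes 12 cores with SMT (24 logical processors) and 3 cores per CCX,
with one L3 per CCX. The helper name expected_l3_masks is made up here
and is not part of QEMU or coreinfo.

```python
def expected_l3_masks(cores, threads_per_core, cores_per_ccx):
    """Return one coreinfo-style mask string per L3 cache (one L3 per CCX)."""
    if cores % cores_per_ccx:
        raise ValueError("cores must be a multiple of cores_per_ccx")
    total = cores * threads_per_core           # logical processors in the guest
    width = cores_per_ccx * threads_per_core   # logical processors sharing one L3
    return ["-" * start + "*" * width + "-" * (total - start - width)
            for start in range(0, total, width)]


# Assumed 3900X layout: 12 cores, 2 threads per core, 3 cores per CCX.
for mask in expected_l3_masks(cores=12, threads_per_core=2, cores_per_ccx=3):
    print(mask)
```

Under those assumptions this yields four masks of six logical processors
each, rather than the three masks of eight shown in the coreinfo output
above.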
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1856335
Title:
Cache Layout wrong on many Zen Arch CPUs
Status in QEMU:
New
Bug description:
AMD CPUs have one L3 cache per 2, 3 or 4 cores. Currently, TOPOEXT seems
to always map the cache as if it were a 4-core-per-CCX CPU, which is
incorrect and costs up to 30% performance (more realistically 10%)
in applications that are aware of the L3 cache layout.
Example on a 4-CCX CPU (1950X w/ 8 cores and no SMT):
<cpu mode='custom' match='exact' check='full'>
<model fallback='forbid'>EPYC-IBPB</model>
<vendor>AMD</vendor>
<topology sockets='1' cores='8' threads='1'/>
In Windows, coreinfo reports correctly:
****---- Unified Cache 1, Level 3, 8 MB, Assoc 16, LineSize 64
----**** Unified Cache 6, Level 3, 8 MB, Assoc 16, LineSize 64
On a 3-CCX CPU (3960X w/ 6 cores and no SMT):
<cpu mode='custom' match='exact' check='full'>
<model fallback='forbid'>EPYC-IBPB</model>
<vendor>AMD</vendor>
<topology sockets='1' cores='6' threads='1'/>
In Windows, coreinfo reports incorrectly:
****-- Unified Cache 1, Level 3, 8 MB, Assoc 16, LineSize 64
----** Unified Cache 6, Level 3, 8 MB, Assoc 16, LineSize 64
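
The mismatch can be checked mechanically. A minimal sketch, assuming
coreinfo lines of the form shown above (mask first, description after
it); the function name l3_groups is hypothetical and the expected group
size of 3 comes from the 3-cores-per-CCX case described here.

```python
def l3_groups(coreinfo_lines):
    """Map each 'Level 3' line to the logical processor indices it covers."""
    groups = []
    for line in coreinfo_lines:
        parts = line.split(None, 1)  # mask, then the rest of the description
        if len(parts) != 2 or "Level 3" not in parts[1]:
            continue
        mask = parts[0]
        groups.append([i for i, ch in enumerate(mask) if ch == "*"])
    return groups


# The two L3 lines reported by the guest in the 6-core example.
reported = [
    "****-- Unified Cache 1, Level 3, 8 MB, Assoc 16, LineSize 64",
    "----** Unified Cache 6, Level 3, 8 MB, Assoc 16, LineSize 64",
]
groups = l3_groups(reported)
sizes = [len(g) for g in groups]
print(groups)                        # processors sharing each L3
print(all(s == 3 for s in sizes))    # does every L3 cover exactly 3 cores?
```

For the lines above the groups cover 4 and 2 processors, so the check
for 3 cores per L3 fails.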
Validated against 3.0, 3.1, 4.1 and 4.2 versions of qemu-kvm.
With newer QEMU there is a fix (that does behave correctly) using the dies
parameter:
<qemu:arg value='cores=3,threads=1,dies=2,sockets=1'/>
The problem is that the dies are exposed differently from how AMD does
it natively: they are exposed to Windows as sockets, which means that
if you are not a business user, you can never have a machine with
more than two CCXs (6 cores), as consumer versions of Windows support
only two sockets. (Should this be reported as a separate bug?)
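
To make the arithmetic explicit, here is a rough sketch that multiplies
out the topology string from the workaround. It takes the description
above at face value (each die shows up in Windows as a socket, and
consumer Windows supports two sockets); parse_smp and the constant are
names invented for this illustration.

```python
CONSUMER_WINDOWS_SOCKET_LIMIT = 2  # as stated in the report, not verified here


def parse_smp(arg):
    """Turn 'cores=3,threads=1,...' into a dict of integers."""
    return {key: int(value)
            for key, value in (item.split("=") for item in arg.split(","))}


topo = parse_smp("cores=3,threads=1,dies=2,sockets=1")
vcpus = topo["sockets"] * topo["dies"] * topo["cores"] * topo["threads"]
# Per the report, Windows sees every die as a separate socket.
guest_sockets = topo["sockets"] * topo["dies"]

print(vcpus)                                           # total vCPUs
print(guest_sockets <= CONSUMER_WINDOWS_SOCKET_LIMIT)  # within the limit?
```

With dies=2 the guest has 6 vCPUs and stays within the two-socket
limit; a third or fourth CCX modelled as an extra die would exceed it.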
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1856335/+subscriptions