[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Bug 1856335] Re: Cache Layout wrong on many Zen Arch CPUs
From: |
Damir |
Subject: |
[Bug 1856335] Re: Cache Layout wrong on many Zen Arch CPUs |
Date: |
Sun, 22 Dec 2019 10:09:16 -0000 |
Hi,
I've since confirmed that this bug also exist (as expected) on Linux
guests, as well as Zen1 EPYC 7401 CPUs, to make sure this wasn't a
problem with the detection of the newer consumer platform.
Basically it seems (looking at the code with layman eyes) that as long
as you have a topology that is dividable by 4 or 8, it will always
result in the wrong topology being exposed to the guest, even when the
correct option can be built (12, 24 core CPUs, although, it would be
great if we could support 9 Core VM CPus as that is a reasonable use
case for VMs (3 CCXs of 3 Cores for a total of 9 (or 18 SMT threads)).
Pinging the author and committer of the TopoEXT feature / EPYC cpu model
as they should probably know best how to solve this issue.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1856335
Title:
Cache Layout wrong on many Zen Arch CPUs
Status in QEMU:
New
Bug description:
AMD CPUs have L3 cache per 2, 3 or 4 cores. Currently, TOPOEXT seems
to always map Cache ass if it was an 4-Core per CCX CPU, which is
incorrect, and costs upwards 30% performance (more realistically 10%)
in L3 Cache Layout aware applications.
Example on a 4-CCX CPU (1950X /w 8 Cores and no SMT):
<cpu mode='custom' match='exact' check='full'>
<model fallback='forbid'>EPYC-IBPB</model>
<vendor>AMD</vendor>
<topology sockets='1' cores='8' threads='1'/>
In windows, coreinfo reports correctly:
****---- Unified Cache 1, Level 3, 8 MB, Assoc 16, LineSize 64
----**** Unified Cache 6, Level 3, 8 MB, Assoc 16, LineSize 64
On a 3-CCX CPU (3960X /w 6 cores and no SMT):
<cpu mode='custom' match='exact' check='full'>
<model fallback='forbid'>EPYC-IBPB</model>
<vendor>AMD</vendor>
<topology sockets='1' cores='6' threads='1'/>
in windows, coreinfo reports incorrectly:
****-- Unified Cache 1, Level 3, 8 MB, Assoc 16, LineSize 64
----** Unified Cache 6, Level 3, 8 MB, Assoc 16, LineSize 64
Validated against 3.0, 3.1, 4.1 and 4.2 versions of qemu-kvm.
With newer Qemu there is a fix (that does behave correctly) in using the dies
parameter:
<qemu:arg value='cores=3,threads=1,dies=2,sockets=1'/>
The problem is that the dies are exposed differently than how AMD does
it natively, they are exposed to Windows as sockets, which means, that
if you are nto a business user, you can't ever have a machine with
more than two CCX (6 cores) as consumer versions of Windows only
supports two sockets. (Should this be reported as a separate bug?)
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1856335/+subscriptions