qemu-devel

[Bug 1856335] Re: Cache Layout wrong on many Zen Arch CPUs


From: Heiko Sieger
Subject: [Bug 1856335] Re: Cache Layout wrong on many Zen Arch CPUs
Date: Mon, 18 May 2020 19:19:59 -0000

With regard to Jan's comment earlier and the virsh capabilities listing
the cores and siblings, also note the following lines from virsh
capabilities for a 3900X CPU:

    <cache>
      <bank id='0' level='3' type='both' size='16' unit='MiB' cpus='0-2,12-14'/>
      <bank id='1' level='3' type='both' size='16' unit='MiB' cpus='3-5,15-17'/>
      <bank id='2' level='3' type='both' size='16' unit='MiB' cpus='6-8,18-20'/>
      <bank id='3' level='3' type='both' size='16' unit='MiB' cpus='9-11,21-23'/>
    </cache>

virsh capabilities is perfectly able to identify the L3 cache structure
and associate the right cpus. It would be ideal to just use the above
output inside the libvirt domain configuration to "manually" define the
L3 cache, or something to that effect on the qemu command line.

Users could then decide to pin only a subset of the CPUs, usually a
multiple of 6 logical CPUs (in the case of the 3900X), so that the
pinning aligns with CCX boundaries.
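
As a rough illustration of the idea, the following Python sketch parses
the <cache> element shown above and groups the host CPUs by L3 bank, so
that a pinning set can be chosen in whole-CCX units. The helper names
(expand_cpu_list, l3_banks) are made up for this example, and the XML is
simply pasted in from the listing above rather than read from virsh.

```python
import xml.etree.ElementTree as ET

# Cache section copied from the virsh capabilities output above (3900X).
CACHE_XML = """
<cache>
  <bank id='0' level='3' type='both' size='16' unit='MiB' cpus='0-2,12-14'/>
  <bank id='1' level='3' type='both' size='16' unit='MiB' cpus='3-5,15-17'/>
  <bank id='2' level='3' type='both' size='16' unit='MiB' cpus='6-8,18-20'/>
  <bank id='3' level='3' type='both' size='16' unit='MiB' cpus='9-11,21-23'/>
</cache>
"""


def expand_cpu_list(spec):
    """Expand a CPU list such as '0-2,12-14' into a list of integers."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def l3_banks(xml_text):
    """Return {bank id: [host CPUs]} for every level-3 cache bank."""
    root = ET.fromstring(xml_text)
    return {
        int(bank.get("id")): expand_cpu_list(bank.get("cpus"))
        for bank in root.iter("bank")
        if bank.get("level") == "3"
    }


banks = l3_banks(CACHE_XML)
for bank_id, cpus in sorted(banks.items()):
    print(f"L3 bank {bank_id}: host CPUs {cpus}")

# Hypothetical choice: pin a guest to the first two CCXs only.
pin_set = banks[0] + banks[1]
```

Each bank covers 6 logical CPUs here, which is where the "multiple of 6"
comes from.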

I'm now on kernel 5.6.11 and QEMU v5.0.0.r533.gdebe78ce14-1 (from Arch
Linux AUR qemu-git), running q35-5.1. I will try the host-passthrough
with host-cache-info=on option Jan posted. Question: is
host-cache-info=on the same as <cache mode="passthrough"/> under
<cpu mode="host-passthrough"...>?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1856335

Title:
  Cache Layout wrong on many Zen Arch CPUs

Status in QEMU:
  New

Bug description:
  AMD CPUs have L3 cache per 2, 3 or 4 cores. Currently, TOPOEXT seems
  to always map the cache as if it were a 4-core-per-CCX CPU, which is
  incorrect and costs up to 30% performance (more realistically 10%)
  in applications that are aware of the L3 cache layout.

  Example on a 4-CCX CPU (1950X with 8 cores and no SMT):

    <cpu mode='custom' match='exact' check='full'>
      <model fallback='forbid'>EPYC-IBPB</model>
      <vendor>AMD</vendor>
      <topology sockets='1' cores='8' threads='1'/>

  In Windows, coreinfo reports correctly:

  ****----  Unified Cache 1, Level 3,    8 MB, Assoc  16, LineSize  64
  ----****  Unified Cache 6, Level 3,    8 MB, Assoc  16, LineSize  64

  On a 3-CCX CPU (3960X with 6 cores and no SMT):

   <cpu mode='custom' match='exact' check='full'>
      <model fallback='forbid'>EPYC-IBPB</model>
      <vendor>AMD</vendor>
      <topology sockets='1' cores='6' threads='1'/>

  In Windows, coreinfo reports incorrectly:

  ****--  Unified Cache  1, Level 3,    8 MB, Assoc  16, LineSize  64
  ----**  Unified Cache  6, Level 3,    8 MB, Assoc  16, LineSize  64

  Validated against versions 3.0, 3.1, 4.1 and 4.2 of qemu-kvm.

  With newer QEMU there is a fix (that does behave correctly) in using
  the dies parameter:
   <qemu:arg value='cores=3,threads=1,dies=2,sockets=1'/>
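
  To make the arithmetic behind that argument explicit, here is a small
  Python sketch that derives the topology string from the total guest
  core count and the number of cores per CCX. The function name smp_args
  is hypothetical; it only formats a string in the same shape as the
  line above and assumes one die per CCX and a single socket.

```python
def smp_args(total_cores, cores_per_ccx, threads=1):
    """Build a topology string with one die per CCX (assumed mapping)."""
    if total_cores % cores_per_ccx != 0:
        raise ValueError("total_cores must be a multiple of cores_per_ccx")
    dies = total_cores // cores_per_ccx
    return f"cores={cores_per_ccx},threads={threads},dies={dies},sockets=1"


# 6 guest cores on a 3-core-per-CCX part give two dies of three cores.
print(smp_args(6, 3))
```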

  The problem is that the dies are exposed differently from how AMD does
  it natively: they are exposed to Windows as sockets. This means that
  if you are not a business user, you can never have a machine with
  more than two CCXs (6 cores), as consumer versions of Windows only
  support two sockets. (Should this be reported as a separate bug?)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1856335/+subscriptions


