Re: [PATCH] i386: Add new CPU model SapphireRapids
From: Yang Zhong
Subject: Re: [PATCH] i386: Add new CPU model SapphireRapids
Date: Wed, 28 Sep 2022 04:12:32 -0400
On Mon, Sep 26, 2022 at 09:51:13AM +0100, Dr. David Alan Gilbert wrote:
> * Yang Zhong (yang.zhong@linux.intel.com) wrote:
> > On Sat, Sep 24, 2022 at 12:01:16AM +0800, Xiaoyao Li wrote:
> > > On 9/23/2022 9:30 PM, Yang Zhong wrote:
> > > > On Wed, Sep 21, 2022 at 03:51:42PM +0100, Dr. David Alan Gilbert wrote:
> > > > > * Wang, Lei (lei4.wang@intel.com) wrote:
> > > > > > > The new CPU model mostly inherits features from Icelake-Server,
> > > > > > > while adding new features:
> > > > > > >  - AMX (Advanced Matrix Extensions)
> > > > > > >  - Bus Lock Debug Exception
> > > > > > > and new instructions:
> > > > > > >  - AVX VNNI (Vector Neural Network Instruction):
> > > > > > >     - VPDPBUSD: Multiply and Add Unsigned and Signed Bytes
> > > > > > >     - VPDPBUSDS: Multiply and Add Unsigned and Signed Bytes with
> > > > > > >       Saturation
> > > > > > >     - VPDPWSSD: Multiply and Add Signed Word Integers
> > > > > > >     - VPDPWSSDS: Multiply and Add Signed Word Integers with
> > > > > > >       Saturation
> > > > > > >  - FP16: Replicates existing AVX512 computational SP (FP32)
> > > > > > >    instructions using FP16 instead of FP32 for ~2X performance gain
> > > > > > >  - SERIALIZE: Provide software with a simple way to force the
> > > > > > >    processor to complete all modifications, faster, allowed in all
> > > > > > >    privilege levels and not causing an unconditional VM exit
> > > > > > >  - TSX Suspend Load Address Tracking: Allows programmers to choose
> > > > > > >    which memory accesses do not need to be tracked in the TSX read
> > > > > > >    set
> > > > > > >  - AVX512_BF16: Vector Neural Network Instructions supporting
> > > > > > >    BFLOAT16 inputs and conversion instructions from IEEE single
> > > > > > >    precision
> > > > > >
> > > > > > Features may be added in future versions:
> > > > > > - CET (virtualization support hasn't been merged)
> > > > > > Instructions may be added in future versions:
> > > > > > - fast zero-length MOVSB (KVM doesn't support yet)
> > > > > > - fast short STOSB (KVM doesn't support yet)
> > > > > > - fast short CMPSB, SCASB (KVM doesn't support yet)
> > > > > >
> > > > > > Signed-off-by: Wang, Lei <lei4.wang@intel.com>
> > > > > > Reviewed-by: Robert Hoo <robert.hu@linux.intel.com>
> > > > >
> > > > > Hi,
> > > > > > What fills in the AMX tile and tmul information leaves
> > > > > > (0x1D, 0x1E)?
> > > > > > In particular, how would we make sure when we migrate between two
> > > > > > generations of AMX/Tile/Tmul capable devices with different
> > > > > > register/palette/tmul limits that the migration is tied to the CPU
> > > > > > type correctly?
> > > > > > Would you expect all devices called a 'SapphireRapids' to have the
> > > > > > same sizes?
> > > > >
> > > >
> > > > There is only one palette in the current design. This palette includes
> > > > 8 tiles. Those two CPUID leaves define bytes_per_tile,
> > > > total_tile_bytes, max_rows, etc.; the AMX tool configures those values
> > > > into TILECFG with the ldtilecfg instruction. Once the tiles are
> > > > configured, we can use the tileload instruction to load data into them.
> > > >
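To make the TILECFG step above concrete, here is a minimal sketch of filling
in the 64-byte configuration block that ldtilecfg consumes. The struct layout
(palette byte, start_row byte, 14 reserved bytes, 16 colsb words, 16 rows
bytes) is an assumption based on the public AMX documentation, and the names
struct tile_config and tile_config_init are hypothetical. The 16 rows and
64 bytes per row are the limits quoted later in this thread. The sketch only
builds the memory operand; it does not execute ldtilecfg.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define AMX_MAX_NAMES     8   /* tiles in the single palette (from the thread) */
#define AMX_MAX_ROWS      16  /* rows per tile (leaf 0x1D value in the thread) */
#define AMX_BYTES_PER_ROW 64  /* bytes per row (leaf 0x1D value in the thread) */

/* Hypothetical name; layout assumed from the public AMX documentation. */
struct tile_config {
    uint8_t  palette_id;    /* byte 0: palette selector */
    uint8_t  start_row;     /* byte 1: restart row for interrupted loads */
    uint8_t  reserved[14];  /* bytes 2..15: must be zero */
    uint16_t colsb[16];     /* bytes 16..47: bytes per row for each tile */
    uint8_t  rows[16];      /* bytes 48..63: row count for each tile */
};

/* The configuration block is exactly 64 bytes. */
_Static_assert(sizeof(struct tile_config) == 64, "TILECFG must be 64 bytes");

/* Configure all 8 tiles of palette 1 at their maximum size. */
void tile_config_init(struct tile_config *cfg)
{
    assert(cfg != NULL);
    memset(cfg, 0, sizeof(*cfg));
    cfg->palette_id = 1;
    for (int i = 0; i < AMX_MAX_NAMES; i++) {
        cfg->colsb[i] = AMX_BYTES_PER_ROW;
        cfg->rows[i] = AMX_MAX_ROWS;
    }
}
```

A guest that has built its configuration from one set of limits would see
inconsistent values if the destination host reported different ones, which is
why the leaf contents matter for migration.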
> > > > We tested migration between two SapphireRapids hosts with the AMX
> > > > selftest tool (tools/testing/selftests/x86/amx.c) running on both
> > > > sides, and the migration worked well.
> > > >
> > > > For SapphireRapids and newer CPU types, the definitions of those two
> > > > CPUID leaves are the same for AMX.
> > >
> > > I'm not sure what "definitions" means here. Are you saying the CPUID
> > > values of leaf 0x1D and 0x1E won't change for any future Intel silicon?
> > >
> > > Personally, I doubt it. And we shouldn't make such an assumption unless
> > > Intel states it in the SDM.
> >
> > The current 0x1D and 0x1E definitions as below:
> >
> > /* CPUID Leaf 0x1D constants: */
> > #define INTEL_AMX_TILE_MAX_SUBLEAF 0x1
> > #define INTEL_AMX_TOTAL_TILE_BYTES 0x2000
> > #define INTEL_AMX_BYTES_PER_TILE 0x400
> > #define INTEL_AMX_BYTES_PER_ROW 0x40
> > #define INTEL_AMX_TILE_MAX_NAMES 0x8
> > #define INTEL_AMX_TILE_MAX_ROWS 0x10
> >
> > /* CPUID Leaf 0x1E constants: */
> > #define INTEL_AMX_TMUL_MAX_K 0x10
> > #define INTEL_AMX_TMUL_MAX_N 0x40
> >
> > These values are defined in the SDM, and on the new CPU under
> > development these values are still the same as on SapphireRapids. Thanks!
>
> But there's nothing stopping them from increasing in future versions?
>
Okay, thanks! We will add these CPUID leaves to this CPU model.
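
As a rough illustration of what pinning the leaves in the CPU model could look
like, the sketch below packs the constants quoted in this thread into register
values and compares two sets for migration compatibility. This is not the
actual patch: struct amx_cpuid, amx_cpuid_spr and amx_cpuid_equal are
hypothetical names, and the bit positions (leaf 0x1D subleaf 1 in
EAX/EBX/ECX, leaf 0x1E subleaf 0 in EBX) are my reading of the SDM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID Leaf 0x1D constants, as quoted in this thread. */
#define INTEL_AMX_TILE_MAX_SUBLEAF 0x1
#define INTEL_AMX_TOTAL_TILE_BYTES 0x2000
#define INTEL_AMX_BYTES_PER_TILE   0x400
#define INTEL_AMX_BYTES_PER_ROW    0x40
#define INTEL_AMX_TILE_MAX_NAMES   0x8
#define INTEL_AMX_TILE_MAX_ROWS    0x10

/* CPUID Leaf 0x1E constants, as quoted in this thread. */
#define INTEL_AMX_TMUL_MAX_K       0x10
#define INTEL_AMX_TMUL_MAX_N       0x40

/* Hypothetical container for the AMX-related CPUID registers. */
struct amx_cpuid {
    uint32_t tile_max_subleaf; /* leaf 0x1D, subleaf 0, EAX */
    uint32_t tile_eax;         /* leaf 0x1D, subleaf 1 */
    uint32_t tile_ebx;
    uint32_t tile_ecx;
    uint32_t tmul_ebx;         /* leaf 0x1E, subleaf 0 */
};

/* Pack the SapphireRapids values (bit layout assumed from the SDM). */
struct amx_cpuid amx_cpuid_spr(void)
{
    struct amx_cpuid c;

    /* The geometry is self-consistent: 8 * 0x400 = 0x2000, 0x10 * 0x40 = 0x400. */
    assert(INTEL_AMX_TOTAL_TILE_BYTES ==
           INTEL_AMX_TILE_MAX_NAMES * INTEL_AMX_BYTES_PER_TILE);
    assert(INTEL_AMX_BYTES_PER_TILE ==
           INTEL_AMX_TILE_MAX_ROWS * INTEL_AMX_BYTES_PER_ROW);

    c.tile_max_subleaf = INTEL_AMX_TILE_MAX_SUBLEAF;
    c.tile_eax = INTEL_AMX_TOTAL_TILE_BYTES | (INTEL_AMX_BYTES_PER_TILE << 16);
    c.tile_ebx = INTEL_AMX_BYTES_PER_ROW | (INTEL_AMX_TILE_MAX_NAMES << 16);
    c.tile_ecx = INTEL_AMX_TILE_MAX_ROWS;
    c.tmul_ebx = INTEL_AMX_TMUL_MAX_K | (INTEL_AMX_TMUL_MAX_N << 8);
    return c;
}

/* A guest may only move to a host that can present identical AMX limits. */
bool amx_cpuid_equal(const struct amx_cpuid *a, const struct amx_cpuid *b)
{
    return a->tile_max_subleaf == b->tile_max_subleaf &&
           a->tile_eax == b->tile_eax &&
           a->tile_ebx == b->tile_ebx &&
           a->tile_ecx == b->tile_ecx &&
           a->tmul_ebx == b->tmul_ebx;
}
```

With the values fixed in the model, a future host reporting larger limits
would still expose the SapphireRapids numbers to a guest started with this
CPU model.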
Yang
> Dave
>
> > Yang
> > >
> > > > So, from the AMX perspective, migration should be workable on
> > > > subsequent CPU types. Thanks!
> > >
> > > I think what Dave is worried about is migrating a VM created with the
> > > "SapphireRapids" model on an SPR machine to some newer platform in the
> > > future, where the newer platform reports different values for CPUID
> > > leaves 0x1D and 0x1E than the SPR platform.
> > >
> > > I think we need to include CPUID leaves 0x1D and 0x1E in the CPU model
> > > as well. Otherwise we will hit the same problem as with Intel PT, where
> > > SPR reports fewer capabilities than ICX.
> > >
> >
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>