On Wed, Sep 21, 2022 at 03:51:42PM +0100, Dr. David Alan Gilbert wrote:
* Wang, Lei (lei4.wang@intel.com) wrote:
The new CPU model mostly inherits features from Icelake-Server, while
adding new features:
- AMX (Advanced Matrix eXtensions)
- Bus Lock Debug Exception
and new instructions:
- AVX VNNI (Vector Neural Network Instruction):
- VPDPBUSD: Multiply and Add Unsigned and Signed Bytes
- VPDPBUSDS: Multiply and Add Unsigned and Signed Bytes with Saturation
- VPDPWSSD: Multiply and Add Signed Word Integers
- VPDPWSSDS: Multiply and Add Signed Word Integers with Saturation
- FP16: Replicates existing AVX512 computational SP (FP32) instructions
using FP16 instead of FP32 for ~2X performance gain
- SERIALIZE: Provides software with a simple way to force the processor to
complete all modifications; it is faster, allowed at all privilege levels,
and does not cause an unconditional VM exit
- TSX Suspend Load Address Tracking: Allows programmers to choose which
memory accesses do not need to be tracked in the TSX read set
- AVX512_BF16: Vector Neural Network Instructions supporting BFLOAT16
inputs and conversion instructions from IEEE single precision
Features that may be added in future versions:
- CET (virtualization support hasn't been merged)
Instructions that may be added in future versions:
- fast zero-length MOVSB (KVM doesn't support it yet)
- fast short STOSB (KVM doesn't support it yet)
- fast short CMPSB, SCASB (KVM doesn't support them yet)
Signed-off-by: Wang, Lei <lei4.wang@intel.com>
Reviewed-by: Robert Hoo <robert.hu@linux.intel.com>
Hi,
What fills in the AMX tile and tmul information leafs
(0x1D, 0x1E)?
In particular, how would we make sure when we migrate between two
generations of AMX/Tile/Tmul capable devices with different
register/palette/tmul limits that the migration is tied to the CPU type
correctly?
Would you expect all devices called a 'SapphireRapids' to have the same
sizes?
There is only one palette in the current design, and this palette includes 8
tiles. Those two CPUID leafs define bytes_per_tile, total_tile_bytes,
max_rows, etc. The AMX tool configures those values into TILECFG with the
ldtilecfg instruction. Once the tiles are configured, we can use the
tileload instruction to load data into them.
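
To make the flow above concrete, here is a minimal sketch (not taken from
amx.c or QEMU) of how the palette fields in CPUID leaf 0x1D, sub-leaf 1 could
be decoded and copied into the 64-byte TILECFG memory image that ldtilecfg
consumes. The struct and function names are made up for illustration, and the
sketch only builds the image in memory; it does not execute ldtilecfg, which
needs AMX-capable hardware and OS enablement.

```c
#include <stdint.h>
#include <string.h>

/* Palette limits reported by CPUID leaf 0x1D, sub-leaf 1 (hypothetical type). */
struct amx_palette {
    uint16_t total_tile_bytes; /* EAX[15:0]  */
    uint16_t bytes_per_tile;   /* EAX[31:16] */
    uint16_t bytes_per_row;    /* EBX[15:0]  */
    uint16_t max_names;        /* EBX[31:16], number of tile registers */
    uint16_t max_rows;         /* ECX[15:0]  */
};

/* 64-byte memory operand layout used by LDTILECFG. */
struct amx_tilecfg {
    uint8_t  palette_id;
    uint8_t  start_row;
    uint8_t  reserved[14];
    uint16_t colsb[16];        /* bytes per row, one entry per tile */
    uint8_t  rows[16];         /* row count, one entry per tile */
};

/* Split the raw register values of leaf 0x1D, sub-leaf 1 into fields. */
static struct amx_palette amx_decode_palette(uint32_t eax, uint32_t ebx,
                                             uint32_t ecx)
{
    struct amx_palette p;

    p.total_tile_bytes = (uint16_t)(eax & 0xffff);
    p.bytes_per_tile   = (uint16_t)(eax >> 16);
    p.bytes_per_row    = (uint16_t)(ebx & 0xffff);
    p.max_names        = (uint16_t)(ebx >> 16);
    p.max_rows         = (uint16_t)(ecx & 0xffff);
    return p;
}

/* Build a TILECFG image that sizes every tile to the palette maximum. */
static void amx_fill_tilecfg(struct amx_tilecfg *cfg,
                             const struct amx_palette *p)
{
    memset(cfg, 0, sizeof(*cfg));
    cfg->palette_id = 1;       /* the single palette discussed above */
    for (unsigned i = 0; i < p->max_names && i < 16; i++) {
        cfg->colsb[i] = p->bytes_per_row;
        cfg->rows[i]  = (uint8_t)p->max_rows;
    }
}
```

Fed with EAX=0x04002000, EBX=0x00080040, ECX=0x10 (the palette 1 values as I
understand them from the SDM), the decoder yields 8 tiles of 16 rows by 64
bytes, and the resulting image has entries filled in for tiles 0 to 7 only.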
We tested migration between two SapphireRapids hosts with the AMX selftest
tool (tools/testing/selftests/x86/amx.c) running on both sides, and the
migration worked well.
For SapphireRapids and newer CPU types, the definitions in those two CPUID
leafs are all the same for AMX.
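
As a rough illustration of the migration concern raised above, the check
below compares the AMX limits seen on the source and destination sides. This
is only a sketch under my own assumptions: struct amx_limits and
amx_limits_equal() are hypothetical names, not existing QEMU or KVM code, and
the fields simply mirror what leafs 0x1D and 0x1E report.

```c
#include <stdbool.h>
#include <stdint.h>

/* AMX limits from CPUID leafs 0x1D and 0x1E (hypothetical type). */
struct amx_limits {
    uint32_t max_palette;      /* leaf 0x1D, sub-leaf 0, EAX */
    uint16_t total_tile_bytes; /* leaf 0x1D, sub-leaf 1, EAX[15:0]  */
    uint16_t bytes_per_tile;   /* leaf 0x1D, sub-leaf 1, EAX[31:16] */
    uint16_t bytes_per_row;    /* leaf 0x1D, sub-leaf 1, EBX[15:0]  */
    uint16_t max_names;        /* leaf 0x1D, sub-leaf 1, EBX[31:16] */
    uint16_t max_rows;         /* leaf 0x1D, sub-leaf 1, ECX[15:0]  */
    uint8_t  tmul_maxk;        /* leaf 0x1E, sub-leaf 0, EBX[7:0]   */
    uint16_t tmul_maxn;        /* leaf 0x1E, sub-leaf 0, EBX[23:8]  */
};

/*
 * Return true only if both sides report identical tile and TMUL limits.
 * Compared field by field so struct padding cannot affect the result.
 */
static bool amx_limits_equal(const struct amx_limits *a,
                             const struct amx_limits *b)
{
    return a->max_palette      == b->max_palette &&
           a->total_tile_bytes == b->total_tile_bytes &&
           a->bytes_per_tile   == b->bytes_per_tile &&
           a->bytes_per_row    == b->bytes_per_row &&
           a->max_names        == b->max_names &&
           a->max_rows         == b->max_rows &&
           a->tmul_maxk        == b->tmul_maxk &&
           a->tmul_maxn        == b->tmul_maxn;
}
```

If a later CPU generation ever changed any of these values, a check of this
kind would fail, which is the case where the limits would need to be tied to
the CPU model version rather than assumed constant.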