Re: [PATCH v4 19/45] target/arm: Implement SME MOVA

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 19/45] target/arm: Implement SME MOVA

From:	Richard Henderson
Subject:	Re: [PATCH v4 19/45] target/arm: Implement SME MOVA
Date:	Mon, 4 Jul 2022 14:38:30 +0530
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1

On 7/1/22 21:49, Peter Maydell wrote:

On Tue, 28 Jun 2022 at 05:40, Richard Henderson
<richard.henderson@linaro.org> wrote:


We can reuse the SVE functions for implementing moves to/from
horizontal tile slices, but we need new ones for moves to/from
vertical tile slices.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---

diff --git a/target/arm/sme_helper.c b/target/arm/sme_helper.c
index eef2df73e1..95159862de 100644
--- a/target/arm/sme_helper.c
+++ b/target/arm/sme_helper.c
@@ -19,8 +19,10 @@

  #include "qemu/osdep.h"
  #include "cpu.h"
-#include "internals.h"


Did you mean to delete this #include line ?


I meant to go back and remove it from whence it came, but kept forgetting.

+void HELPER(sme_mova_cz_d)(void *za, void *vn, void *vg, uint32_t desc)
+{
+    int i, oprsz = simd_oprsz(desc) / 8;
+    uint8_t *pg = vg;
+    uint64_t *n = vn;
+    uint64_t *a = za;
+
+    for (i = 0; i < oprsz; i++) {
+        if (pg[H1_2(i)] & 1) {


The documentation of the H macros says
"The H1_<N> macros are used when performing byte arithmetic and then
  casting the final pointer to a type of size N."
but we're not casting anything to a 2-byte type, so what's happening here?


Yep, error.

+void HELPER(sme_mova_cz_q)(void *za, void *vn, void *vg, uint32_t desc)
+{
+    int i, oprsz = simd_oprsz(desc) / 16;
+    uint16_t *pg = vg;
+    Int128 *n = vn;
+    Int128 *a = za;
+
+    for (i = 0; i < oprsz; i++) {
+        if (pg[H2(i)] & 1) {
+            a[i * sizeof(ARMVectorReg)] = n[i];


Is it really OK to do this with an Int128 store? That is
in host-order, but the two halves of a 128-bit quantity
in the ZA array are in architectural order. I suppose the
source also will have them in the architectural order, but
it does look odd, especially uncommented.


Would memcpy be better for you?

+    /* Resolve tile.size[index] to an untyped ZA slice index. */
+    t_index = tcg_temp_new_i32();
+    tcg_gen_trunc_tl_i32(t_index, cpu_reg(s, rs));
+    tcg_gen_addi_i32(t_index, t_index, index);


This code isn't doing what the comment says; it's just calculating
the (not-yet-taken-MOD-dim) slice index, which does depend on the type.


I guess the comment applies to a larger section than just these two lines.

+
+    len = ctz32(streaming_vec_reg_size(s)) - esz;


What is this the length of ?


The length of the extract, aka the mod.

+        /* The tile slice offset within env->zarray is the column offset. */
+        offset = tile;


I don't understand why we can just add the tile index
(which is going to be an integer 0, 1, 2..) to a byte offset.
In the horizontal case we add tile * sizeof(ARMVectorReg),
which makes more sense to me.


Hmm.  I think you're right this should be tile * column width, or

  offset = tile << esz;

I wish I could compare vs FVP...

+        /*
+         * For big-endian, adjust the column slice offset within
+         * the uint64_t host words that make up env->zarray.
+         * This must wait until index and offset are combined.


Why? Neither the byte-offset of the start of the tile nor
the byte offset of zarray in CPUARMState ought to be non-8-aligned.

Columns will not be 8-aligned. On page 38 of 0616A.a, see the illustration of ZA0V.B[22],which is 6 mod 8.


I'll try and improve the commentary.


r~

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [PATCH v4 19/45] target/arm: Implement SME MOVA, Peter Maydell, 2022/07/01
- Re: [PATCH v4 19/45] target/arm: Implement SME MOVA, Richard Henderson <=
  - Re: [PATCH v4 19/45] target/arm: Implement SME MOVA, Peter Maydell, 2022/07/04
    - Re: [PATCH v4 19/45] target/arm: Implement SME MOVA, Richard Henderson, 2022/07/04

Prev by Date: Re: [PATCH v4 0/4] hmat acpi: Don't require initiator value in -numa
Next by Date: Re: [PATCH 10/11] hw/intc/loongarch_ipi: Fix ipi device access of 64bits
Previous by thread: Re: [PATCH v4 19/45] target/arm: Implement SME MOVA
Next by thread: Re: [PATCH v4 19/45] target/arm: Implement SME MOVA
Index(es):
- Date
- Thread