From: Richard Henderson
Subject: Re: [PATCH 3/4] target/arm: Fixup SIMD fcmla(by element) in 4H arrangement
Date: Tue, 8 Dec 2020 15:04:16 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

On 12/6/20 10:46 PM, LIU Zhiwei wrote:
> For SIMD fcmla(by element), if the number of elements is less than
> the number of elements within one segment, i.e. 4H arrangement,
> we should not calculate the entire segment.
>
> Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
> ---
> target/arm/vec_helper.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
> index 7174030377..44b8165323 100644
> --- a/target/arm/vec_helper.c
> +++ b/target/arm/vec_helper.c
> @@ -544,6 +544,10 @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm,
> neg_real <<= 15;
> neg_imag <<= 15;
>
> + /* Adjust eltspersegment for simd 4H */
> + if (eltspersegment > elements) {
> + eltspersegment = elements;
> + }
OK. It may be better to fold this back into the initialization of eltspersegment using MIN.
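
A minimal, standalone sketch of what folding the clamp into the initialization could look like. It is not the actual helper: SKETCH_MIN, elts_per_segment and elements_visited are hypothetical names used only for illustration, and the sketch assumes a 16-byte segment and an operation size that is a multiple of the element size.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for a MIN macro; hypothetical, only for this sketch. */
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Elements per 16-byte segment, clamped to the total element count.
 * opr_sz is the operation size in bytes, elt_size the element size.
 */
static size_t elts_per_segment(size_t opr_sz, size_t elt_size)
{
    assert(elt_size > 0 && opr_sz % elt_size == 0);
    size_t elements = opr_sz / elt_size;
    return SKETCH_MIN(16 / elt_size, elements);
}

/*
 * Count the elements a segment loop of the form
 * "for (i = 0; i < elements; i += eltspersegment)" would cover.
 */
static size_t elements_visited(size_t opr_sz, size_t elt_size)
{
    size_t elements = opr_sz / elt_size;
    size_t eltspersegment = elts_per_segment(opr_sz, elt_size);
    size_t visited = 0;

    for (size_t i = 0; i < elements; i += eltspersegment) {
        visited += eltspersegment;
    }
    return visited;
}
```

With an 8-byte operation and 2-byte elements (the 4H case), the clamped segment size is 4 rather than 8, so the loop covers exactly the 4 elements that exist. With a 16-byte operation the clamp has no effect.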
> for (i = 0; i < elements; i += eltspersegment) {
> float16 mr = m[H2(i + 2 * index + 0)];
> float16 mi = m[H2(i + 2 * index + 1)];
> @@ -610,6 +614,10 @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm,
> neg_real <<= 31;
> neg_imag <<= 31;
>
> + /* Adjust eltspersegment for simd 4H */
> + if (eltspersegment > elements) {
> + eltspersegment = elements;
> + }
Incorrect: this function only computes 4S, so eltspersegment can never exceed elements here.
> for (i = 0; i < elements; i += eltspersegment) {
> float32 mr = m[H4(i + 2 * index + 0)];
> float32 mi = m[H4(i + 2 * index + 1)];
>
r~