Re: [Qemu-devel] [PATCH v4 4/9] target-ppc: improve lxvw4x implementation

qemu-devel

From:	David Gibson
Subject:	Re: [Qemu-devel] [PATCH v4 4/9] target-ppc: improve lxvw4x implementation
Date:	Thu, 29 Sep 2016 11:38:41 +1000
User-agent:	Mutt/1.7.0 (2016-08-17)

On Wed, Sep 28, 2016 at 11:01:22AM +0530, Nikunj A Dadhania wrote:
> Load 8 bytes at a time and manipulate.
> 
> Big-Endian Storage
> +-------------+-------------+-------------+-------------+
> | 00 11 22 33 | 44 55 66 77 | 88 99 AA BB | CC DD EE FF |
> +-------------+-------------+-------------+-------------+
> 
> Little-Endian Storage
> +-------------+-------------+-------------+-------------+
> | 33 22 11 00 | 77 66 55 44 | BB AA 99 88 | FF EE DD CC |
> +-------------+-------------+-------------+-------------+
> 
> Vector load results in:
> +-------------+-------------+-------------+-------------+
> | 00 11 22 33 | 44 55 66 77 | 88 99 AA BB | CC DD EE FF |
> +-------------+-------------+-------------+-------------+
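
The diagrams above can be checked with a small host-side model. This
is only a sketch, not QEMU code: the model_* names are hypothetical,
and it assumes that a little-endian 64-bit load treats the byte at the
lowest address as least significant and that the LE path then
exchanges the two 32-bit words of each doubleword.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the value a little-endian 64-bit load returns
 * for the 8 bytes at p (lowest address is least significant). */
uint64_t model_ld_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Exchange the two 32-bit words of a doubleword. */
uint64_t model_swap_words(uint64_t v)
{
    return (v >> 32) | (v << 32);
}

/* Model of the LE path: two 8-byte loads, each followed by a word swap. */
void model_lxvw4x_le(const uint8_t mem[16], uint64_t *xth, uint64_t *xtl)
{
    *xth = model_swap_words(model_ld_le64(mem));
    *xtl = model_swap_words(model_ld_le64(mem + 8));
}

/* Self-check using the bytes from the "Little-Endian Storage" diagram. */
void model_lxvw4x_le_selfcheck(void)
{
    static const uint8_t mem[16] = {
        0x33, 0x22, 0x11, 0x00, 0x77, 0x66, 0x55, 0x44,
        0xBB, 0xAA, 0x99, 0x88, 0xFF, 0xEE, 0xDD, 0xCC,
    };
    uint64_t xth, xtl;

    model_lxvw4x_le(mem, &xth, &xtl);
    /* Matches the "Vector load results in" diagram. */
    assert(xth == 0x0011223344556677ULL);
    assert(xtl == 0x8899AABBCCDDEEFFULL);
}
```

Calling model_lxvw4x_le_selfcheck() from a test program exercises the
little-endian case only; the big-endian case needs no word swap.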

Ok.  I'm guessing from this that implementing those GPR<->VSR
instructions showed that the earlier versions were endian-incorrect as
I suspected.

Have you verified that this new implementation is actually faster (or
at least no slower) on LE than the original implementation with
individual 32-bit loads?

> Signed-off-by: Nikunj A Dadhania <address@hidden>
> ---
>  target-ppc/translate/vsx-impl.inc.c | 33 +++++++++++++++++++--------------
>  1 file changed, 19 insertions(+), 14 deletions(-)
> 
> diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
> index 74d0533..1eca042 100644
> --- a/target-ppc/translate/vsx-impl.inc.c
> +++ b/target-ppc/translate/vsx-impl.inc.c
> @@ -75,7 +75,6 @@ static void gen_lxvdsx(DisasContext *ctx)
>  static void gen_lxvw4x(DisasContext *ctx)
>  {
>      TCGv EA;
> -    TCGv_i64 tmp;
>      TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
>      TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
>      if (unlikely(!ctx->vsx_enabled)) {
> @@ -84,22 +83,28 @@ static void gen_lxvw4x(DisasContext *ctx)
>      }
>      gen_set_access_type(ctx, ACCESS_INT);
>      EA = tcg_temp_new();
> -    tmp = tcg_temp_new_i64();
>  
>      gen_addr_reg_index(ctx, EA);
> -    gen_qemu_ld32u_i64(ctx, tmp, EA);
> -    tcg_gen_addi_tl(EA, EA, 4);
> -    gen_qemu_ld32u_i64(ctx, xth, EA);
> -    tcg_gen_deposit_i64(xth, xth, tmp, 32, 32);
> -
> -    tcg_gen_addi_tl(EA, EA, 4);
> -    gen_qemu_ld32u_i64(ctx, tmp, EA);
> -    tcg_gen_addi_tl(EA, EA, 4);
> -    gen_qemu_ld32u_i64(ctx, xtl, EA);
> -    tcg_gen_deposit_i64(xtl, xtl, tmp, 32, 32);
> -
> +    if (ctx->le_mode) {
> +        TCGv_i64 t0, t1;
> +
> +        t0 = tcg_temp_new_i64();
> +        t1 = tcg_temp_new_i64();
> +        tcg_gen_qemu_ld_i64(t0, EA, ctx->mem_idx, MO_LEQ);
> +        tcg_gen_shri_i64(t1, t0, 32);
> +        tcg_gen_deposit_i64(xth, t1, t0, 32, 32);
> +        tcg_gen_addi_tl(EA, EA, 8);
> +        tcg_gen_qemu_ld_i64(t0, EA, ctx->mem_idx, MO_LEQ);
> +        tcg_gen_shri_i64(t1, t0, 32);
> +        tcg_gen_deposit_i64(xtl, t1, t0, 32, 32);
> +        tcg_temp_free_i64(t0);
> +        tcg_temp_free_i64(t1);
> +    } else {
> +        tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_BEQ);
> +        tcg_gen_addi_tl(EA, EA, 8);
> +        tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_BEQ);
> +    }
>      tcg_temp_free(EA);
> -    tcg_temp_free_i64(tmp);
>  }
>  
>  #define VSX_STORE_SCALAR(name, operation)                     \
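
For readers comparing the removed and added lines of the hunk above,
the following host-side sketch models both sequences for one
doubleword of the target register. It is not QEMU code: all function
names are hypothetical, deposit64_model() assumes deposit semantics of
"replace bits [ofs, ofs+len) of the first operand with the low bits of
the second", and the two 32-bit arguments stand for the word values
that guest-endian 32-bit loads would return at EA and EA+4. It checks
only that the results agree and says nothing about relative speed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed deposit semantics: replace bits [ofs, ofs+len) of base
 * with the low len bits of field. */
uint64_t deposit64_model(uint64_t base, uint64_t field,
                         unsigned ofs, unsigned len)
{
    uint64_t ones = (len == 64) ? ~0ULL : ((1ULL << len) - 1);
    uint64_t mask = ones << ofs;
    return (base & ~mask) | ((field << ofs) & mask);
}

/* Removed sequence: two 32-bit loads merged with a deposit. */
uint64_t old_word_pair(uint32_t w_at_ea, uint32_t w_at_ea_plus_4)
{
    uint64_t tmp = w_at_ea;         /* first 32-bit load */
    uint64_t xt = w_at_ea_plus_4;   /* second 32-bit load */
    return deposit64_model(xt, tmp, 32, 32);
}

/* Added LE sequence: one 64-bit LE load, shift, then deposit. */
uint64_t new_le_word_pair(uint32_t w_at_ea, uint32_t w_at_ea_plus_4)
{
    /* An LE 64-bit load puts the word at EA in the low half. */
    uint64_t t0 = ((uint64_t)w_at_ea_plus_4 << 32) | w_at_ea;
    uint64_t t1 = t0 >> 32;
    return deposit64_model(t1, t0, 32, 32);
}

/* Both sequences should yield the same doubleword. */
void word_pair_selfcheck(void)
{
    assert(old_word_pair(0x00112233u, 0x44556677u) ==
           0x0011223344556677ULL);
    assert(new_le_word_pair(0x00112233u, 0x44556677u) ==
           0x0011223344556677ULL);
}
```

In this model both paths place the word at EA in the high half of the
result, which is the layout shown in the commit message diagrams.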

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
