qemu-devel



From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH] target/i386: fix pmovsx/pmovzx in-place operations
Date: Wed, 9 Aug 2017 12:39:06 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

On 08/08/2017 22:21, Joseph Myers wrote:
> The SSE4.1 pmovsx* and pmovzx* instructions take packed 1-byte, 2-byte
> or 4-byte inputs and sign-extend or zero-extend them to a wider vector
> output.  The associated helpers for these instructions do the
> extension on each element in turn, starting with the lowest.  If the
> input and output are the same register, this means that all the input
> elements after the first have been overwritten before they are read.
> This patch makes the helpers extend starting with the highest element,
> not the lowest, to avoid such overwriting.  This fixes many GCC test
> failures (161 in the gcc testsuite in my GCC 6-based testing) when
> testing with a default CPU setting enabling those instructions.
> 
> Signed-off-by: Joseph Myers <address@hidden>
> 
> ---
> 
> diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
> index 16509d0..d578216 100644
> --- a/target/i386/ops_sse.h
> +++ b/target/i386/ops_sse.h
> @@ -1617,18 +1617,18 @@ void glue(helper_ptest, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
>  #define SSE_HELPER_F(name, elem, num, F)        \
>      void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)     \
>      {                                           \
> -        d->elem(0) = F(0);                      \
> -        d->elem(1) = F(1);                      \
>          if (num > 2) {                          \
> -            d->elem(2) = F(2);                  \
> -            d->elem(3) = F(3);                  \
>              if (num > 4) {                      \
> -                d->elem(4) = F(4);              \
> -                d->elem(5) = F(5);              \
> -                d->elem(6) = F(6);              \
>                  d->elem(7) = F(7);              \
> +                d->elem(6) = F(6);              \
> +                d->elem(5) = F(5);              \
> +                d->elem(4) = F(4);              \
>              }                                   \
> +            d->elem(3) = F(3);                  \
> +            d->elem(2) = F(2);                  \
>          }                                       \
> +        d->elem(1) = F(1);                      \
> +        d->elem(0) = F(0);                      \
>      }
>  
>  SSE_HELPER_F(helper_pmovsxbw, W, 8, (int8_t) s->B)
> 

Queued, thanks.

Paolo
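
For readers who want to see the overlap problem from the commit message in
isolation, here is a minimal standalone sketch. It is not the QEMU code:
DemoReg, widen_ascending, widen_descending and self_check are hypothetical
names, and the union is only a simplified stand-in for the real register type
and its B()/W() accessors. It widens eight signed bytes to eight 16-bit words,
once lowest-first and once highest-first, with source and destination being
the same register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit register: the same 16 bytes
 * viewed either as bytes or as 16-bit words. */
typedef union {
    int8_t  b[16];
    int16_t w[8];
} DemoReg;

/* Lowest element first (the pre-patch order). When d == s, writing
 * w[0] covers bytes 0..1, so b[1] is clobbered before it is read. */
static void widen_ascending(DemoReg *d, const DemoReg *s)
{
    for (int i = 0; i < 8; i++) {
        d->w[i] = (int16_t)s->b[i];
    }
}

/* Highest element first (the post-patch order). Writing w[i] touches
 * bytes 2i..2i+1, which are never below byte i, so every input byte
 * still to be read (index < i) is left intact. */
static void widen_descending(DemoReg *d, const DemoReg *s)
{
    for (int i = 7; i >= 0; i--) {
        d->w[i] = (int16_t)s->b[i];
    }
}

/* Fill a register with -1, -2, ..., -16 so sign extension is visible. */
static void fill(DemoReg *r)
{
    for (int i = 0; i < 16; i++) {
        r->b[i] = (int8_t)-(i + 1);
    }
}

/* Self-check: in-place descending order gives the expected result,
 * in-place ascending order does not. */
static void self_check(void)
{
    DemoReg r;

    fill(&r);
    widen_descending(&r, &r);
    for (int i = 0; i < 8; i++) {
        assert(r.w[i] == -(i + 1));
    }

    fill(&r);
    widen_ascending(&r, &r);
    assert(r.w[0] == -1);   /* first element is still correct */
    assert(r.w[1] != -2);   /* second input byte was overwritten */
}
```

The descending order is safe for widening because each output element sits at
an equal or higher byte offset than its input element. The same argument
applies to the 4-element and 2-element cases in the macro.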


