[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 1/2] target/arm: Vectorize USHL and SSHL
From: |
Peter Maydell |
Subject: |
Re: [Qemu-devel] [PATCH 1/2] target/arm: Vectorize USHL and SSHL |
Date: |
Thu, 23 May 2019 14:03:22 +0100 |
On Thu, 23 May 2019 at 13:44, Peter Maydell <address@hidden> wrote:
>
> On Sat, 18 May 2019 at 20:19, Richard Henderson
> <address@hidden> wrote:
> >
> > These instructions shift left or right depending on the sign
> > of the input, and 7 bits are significant to the shift. This
> > requires several masks and selects in addition to the actual
> > shifts to form the complete answer.
> >
> > That said, the operation is still a small improvement even for
> > two 64-bit elements -- 13 vector operations instead of 2 * 7
> > integer operations.
> >
> > +void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
> > +{
> > + TCGv_i32 lval = tcg_temp_new_i32();
> > + TCGv_i32 rval = tcg_temp_new_i32();
> > + TCGv_i32 lsh = tcg_temp_new_i32();
> > + TCGv_i32 rsh = tcg_temp_new_i32();
> > + TCGv_i32 zero = tcg_const_i32(0);
> > + TCGv_i32 max = tcg_const_i32(32);
> > +
> > + /*
> > + * Perform possibly out of range shifts, trusting that the operation
> > + * does not trap. Discard unused results after the fact.
> > + */
>
> This comment reads to me like we're relying on a guarantee
> that TCG doesn't make, but in fact the readme says it does:
> out-of-range shifts are "unspecified behavior" which may give
> bogus results but won't crash. Perhaps phrasing the comment
> as "relying on the TCG guarantee that these are only
> 'unspecified behavior' and not 'undefined behavior' and so
> won't crash" would be clearer ?
I had a look through the rest of the patch, but there is
too much code here and I don't have enough context to
figure out how all the various new gvec helpers are
called and what jobs they are doing compared to the
actual instruction operation. Maybe I'll have another try later.
Why can we get rid of the 8-bit, 32-bit and 64-bit old shift
helpers, but not 16-bit ?
thanks
-- PMM