lightning
Re: Idea : load/stores with pre-decrement / post-increment


From: Paul Cercueil
Subject: Re: Idea : load/stores with pre-decrement / post-increment
Date: Thu, 21 Dec 2023 20:20:41 +0100

Hi Paulo,

On Thursday, 21 December 2023 at 09:05 -0300, Paulo César Pereira de Andrade
wrote:
> On Thu, 21 Dec 2023 at 07:23, Paul Cercueil
> <paul@crapouillou.net> wrote:
> 
> [SNIP]
> 
> > >   So, the idea is the pattern:
> > > 
> > > jit_ldxbr_T(R0, R1, DISP), jit_ldxar_T(R0, R1, DISP)
> > > jit_stxbr_T(R0, R1, DISP) and jit_stxar_T(R0, R1, DISP)
> > > 
> > > where the fallback/generic version does an addi of DISP to the base
> > > register (b)efore or (a)fter the load, and is otherwise a normal
> > > jit_ldr_T or jit_str_T.
> > 
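A rough plain-C model of the fallback described above, for the 32-bit integer case only. This is a sketch of one reading of the proposal, not lightning code: `reg_t` and the four helper names are hypothetical, and each helper just performs the add before or after an ordinary load or store.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a register that holds an address. */
typedef uintptr_t reg_t;

/* "b" form of the load: add DISP to the base, then load from the new base. */
static int32_t load_before_i(reg_t *base, intptr_t disp)
{
    int32_t value;
    *base += (uintptr_t)disp;                          /* addi base, base, DISP */
    memcpy(&value, (const void *)*base, sizeof value); /* plain load */
    return value;
}

/* "a" form of the load: load from the current base, then add DISP to it. */
static int32_t load_after_i(reg_t *base, intptr_t disp)
{
    int32_t value;
    memcpy(&value, (const void *)*base, sizeof value); /* plain load */
    *base += (uintptr_t)disp;                          /* addi base, base, DISP */
    return value;
}

/* "b" form of the store: add DISP to the base, then store at the new base. */
static void store_before_i(reg_t *base, intptr_t disp, int32_t value)
{
    *base += (uintptr_t)disp;
    memcpy((void *)*base, &value, sizeof value);
}

/* "a" form of the store: store at the current base, then add DISP to it. */
static void store_after_i(reg_t *base, intptr_t disp, int32_t value)
{
    memcpy((void *)*base, &value, sizeof value);
    *base += (uintptr_t)disp;
}
```

In this model the only difference between the (b) and (a) forms is which address the memory access sees; the base register ends up with the same value either way.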
> > Do we need DISP? In your examples above it's always equal to sizeof(T)
> > (or, I guess, the negative sizeof(T) as well), and that would be my
> > assumption. I'm not against it, but it sounds a bit error-prone, as
> > well as redundant, since the suffix _c, _s, _i already tells you the
> > increment/decrement value. Unless you want to accept arbitrary
> > increment/decrement values (e.g. ARM supports that), but such usage
> > wouldn't be very typical.
> > 
> > On the other hand... I like that it supports all cases with just 4 new
> > instructions.
> 
>   I might still change it. But the current WIP test implementation is:
> 
> ldxbi_T r0 r1 IM
> ldxai_T r0 r1 IM
> stxbi_T IM r1 r0
> stxai_T IM r1 r0
> 
>   That is, it matches the pattern of the existing ldxi_T and stxi_T. T
> is the usual c, uc, s, us, i, ui, l, f and d. IM is an immediate integer.
>   This leaves room for ldxbr_T, ldxar_T, stxbr_T and stxar_T, which
> might be implemented depending on what the supported ports provide, and
> on whether any port has a version with a variable value to increment the
> base register.
>   The IM value can be positive or negative, and can be useful when
> iterating over a vector of complex types, where the loop iteration
> adjusts the base pointer and loads the element at some offset in a
> single instruction.
>   At least arm64 will support it (not certain about float types) in a
> single instruction. Basically it is an indexed load that also pre- or
> post-increments the base register.

Ok, that's fine.
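
To illustrate the vector-of-complex-types use case, here is a small plain-C sketch of how an immediate that is positive or negative, and not equal to sizeof(T), could walk one field of an array of structs. Everything in it is an assumption made for illustration: `struct record`, the helper names, and the reading that the (b) form adjusts the base before the load and the (a) form after it. It models the semantics only and does not use the lightning API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical element type; only the 'value' field is read in the loops. */
struct record {
    int32_t key;
    int32_t value;
};

/* Model of a post-increment int load: load at base, then base += im. */
static int32_t load_after_i(uintptr_t *base, intptr_t im)
{
    int32_t v;
    memcpy(&v, (const void *)*base, sizeof v);
    *base += (uintptr_t)im;
    return v;
}

/* Model of a pre-increment int load: base += im, then load at base. */
static int32_t load_before_i(uintptr_t *base, intptr_t im)
{
    int32_t v;
    *base += (uintptr_t)im;
    memcpy(&v, (const void *)*base, sizeof v);
    return v;
}

/* Forward walk: positive IM equal to the struct stride, applied after. */
static int32_t sum_values_forward(const struct record *v, size_t n)
{
    uintptr_t base = (uintptr_t)v + offsetof(struct record, value);
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += load_after_i(&base, (intptr_t)sizeof *v);
    return sum;
}

/* Backward walk: negative IM, applied before each load. */
static int32_t sum_values_backward(const struct record *v, size_t n)
{
    /* Start one element past the end, at the 'value' field offset. */
    uintptr_t base = (uintptr_t)(v + n) + offsetof(struct record, value);
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += load_before_i(&base, -(intptr_t)sizeof *v);
    return sum;
}
```

Each loop body is one combined "adjust the pointer and load" step, which is the part a port with native pre/post-indexed addressing could emit as a single instruction.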

Thanks for working on it!

Cheers,
-Paul


