lightning

Re: Unaligned load/store opcodes


From: Paul Cercueil
Subject: Re: Unaligned load/store opcodes
Date: Tue, 28 Mar 2023 13:21:02 +0200

Hi Paulo,

On Tuesday, 28 March 2023 at 07:26 -0300, Paulo César Pereira de Andrade
wrote:
> On Mon, 27 Mar 2023 at 19:17, Paul Cercueil
> <paul@crapouillou.net> wrote:
> > 
> > Hi Paulo,
> > 
> > On Monday, 27 March 2023 at 12:14 -0300, Paulo César Pereira de Andrade
> > wrote:
> > > On Thu, 23 Mar 2023 at 13:50, Paulo César Pereira de Andrade
> > > <paulo.cesar.pereira.de.andrade@gmail.com> wrote:
> > > > 
> > > > On Thu, 23 Mar 2023 at 08:07, Paul Cercueil
> > > > <paul@crapouillou.net> wrote:
> > > > > 
> > > > > Hi Paulo,
> > > > 
> > > >   Hi Paul,
> > > > 
> > > > > I think Lightning would benefit from having support for
> > > > > 16/32/64-bit I/O to unaligned addresses. That's something I
> > > > > would actually use.
> > > > > 
> > > > > Something like:
> > > > > ldur_s / ldur_us / ldur_i / ldur_ui / ldur_l
> > > > > stur_s / stur_i / stur_l
> > > > 
> > > >   These can be added and fallbacks are mostly trivial.
> > > > 
> > > > > I don't think we need ldx/stx variants.
> > > > 
> > > >   For completeness, and unless there is a specialized version
> > > > for ldx/stx, a simple wrapper adding register values is easy.
> > > 
> > >   Using named versions would use too many jit_code_t values
> > > for a complete set of something that has very few special use
> > > cases.
> > > 
> > > > > What do you think?
> > > > 
> > > >   Most CPUs have some kind of help for unaligned reads, or just
> > > > transparently allow them, but with slower load/store.
> > > 
> > >   Maybe we could think of something like:
> > > 
> > > unldr   output base size
> > > unldi   output base size
> > > unldr_u output base size
> > > unldi_u output base size
> > > unsti   base output size
> > > unstr   base output size
> > > 
> > > and could be useful:
> > > 
> > > unldr_f output base
> > > unldi_d output base
> > > unstr_f base output
> > > unsti_d base output
> > > 
> > >   The versions with a register base could have an extra immediate
> > > offset argument. But for consistency, better to not have this
> > > extra immediate.
> > > 
> > >   Since only bytes are addressable, size would be in bytes, and
> > > would also allow words of 3 bytes, and 5, 6 and 7 bytes for 64 bit.
> > 
> > Do we really need this? ... The unaligned load/store would be
> > useful for loading from unaligned fields in a C struct, for
> > instance, but the fields themselves are always 1/2/4/8 bytes, so I
> > don't know in which case you would need to load 3/5/6/7-byte
> > "words".
> 
>   I should have written 3/5/6/7-byte integers.
> 
>   If this is updated to use a single _f modifier, using it with
> 1/2/3/5/6/7-byte floats would trigger an assertion, but it would
> leave room for them in a possible future.
> 
>   The nonstandard integer and float sizes would not be very useful;
> it is just that they would be easy to add to the suggested ABI. They
> might be useful for some language or abstraction using lightning.
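
To make the size-in-bytes idea concrete, here is a minimal C sketch of
what a byte-by-byte fallback could look like. It is not lightning code:
the function names are hypothetical, it assumes little-endian byte
order and a size between 1 and 8, and it relies on the usual
two's-complement conversion from uint64_t to int64_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fallback for an unsigned unaligned load: read `size`
 * bytes (1..8) starting at `base`, little-endian, zero-extended. */
static uint64_t unaligned_load_u(const void *base, size_t size)
{
    const unsigned char *p = base;
    uint64_t value = 0;

    for (size_t i = 0; i < size; i++)
        value |= (uint64_t)p[i] << (8 * i);
    return value;
}

/* Hypothetical signed variant: same read, then sign-extend from the
 * top bit of the `size`-byte integer. */
static int64_t unaligned_load(const void *base, size_t size)
{
    uint64_t value = unaligned_load_u(base, size);

    if (size < 8) {
        uint64_t sign = (uint64_t)1 << (8 * size - 1);
        value = (value ^ sign) - sign;  /* wraps modulo 2^64 */
    }
    return (int64_t)value;
}

/* Hypothetical fallback for an unaligned store: write the low `size`
 * bytes (1..8) of `value` to `base`, little-endian. */
static void unaligned_store(void *base, uint64_t value, size_t size)
{
    unsigned char *p = base;

    for (size_t i = 0; i < size; i++)
        p[i] = (unsigned char)(value >> (8 * i));
}
```

With this shape the 3/5/6/7-byte cases cost nothing extra over the
1/2/4/8-byte ones, since the size is just a loop bound.
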
> 
> > >   The float and double ones are just for convenience, and in most
> > > cases are used for a double aligned on a 4-byte boundary. There
> > > are 2-byte floats (and other sizes), but these usually exist only
> > > in software, and would be too much for lightning, which does not
> > > have any kind of soft float support.
> > 
> > I know that on MIPS32r2 for instance you have LWL/LWR/SWL/SWR for
> > unaligned accesses, but I'm not aware of any mechanism to
> > load/store floating-point on unaligned boundaries. You can't even
> > load it into a GPR, because (at least on MIPS) there would be no
> > way to transfer that value into a FPR. So I'd drop the _f/_d
> > variants.
> 
>   There are MFC1, MTC1, DMFC1 and DMTC1. Load into a GPR, then
> move the bits "as is" in the GPR to/from the FPR. For consistency,
> this would also require making public jit_movr_f_w, jit_movr_w_f,
> and for 32 bit jit_movr_d_ww and jit_movr_ww_d, or for 64 bit
> jit_movr_d_w and jit_movr_w_d.
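
For reference, the "load into a GPR, then move the bits as is" approach
corresponds to something like the following in plain C. This is only a
sketch under assumptions: the helper names are hypothetical, float is
assumed to be 32 bits wide, and memcpy stands in both for the unaligned
integer access and for the raw bit move between register classes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: load a float from a possibly unaligned address
 * by going through its integer bit pattern. */
static float unaligned_load_f(const void *base)
{
    uint32_t bits;
    float value;

    memcpy(&bits, base, sizeof bits);     /* unaligned integer load */
    memcpy(&value, &bits, sizeof value);  /* raw bit move, integer to float */
    return value;
}

/* Hypothetical helper: store a float to a possibly unaligned address
 * by going through its integer bit pattern. */
static void unaligned_store_f(void *base, float value)
{
    uint32_t bits;

    memcpy(&bits, &value, sizeof bits);   /* raw bit move, float to integer */
    memcpy(base, &bits, sizeof bits);     /* unaligned integer store */
}
```

No numeric conversion happens at any point; the bit pattern is carried
through unchanged, which is what makes the integer path usable for
floating-point data.
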
> 
> > Unrelated, but it's a bit confusing to have "ext" and "extr"
> > instructions, could we maybe find a better name?
> > "jit_extbr" for "extract bits"
> > or "jit_maskr" for "extract mask"
> > as two random suggestions.
> 
>   The "ext" has been renamed to "extr". The most common naming
> pattern is "extract" and "deposit" bits. They are also somewhat
> similar to sign/zero extend. The most confusing one is the pair
> "extr_ui r0 r1" and "extr_u r0 r1 i0 i1". Renaming the existing ones
> to sextr or uextr now would be worse. So, it is still an option to
> rename the ones not yet available in an official release.

I meant renaming the ones introduced recently, yes.

Cheers,
-Paul


