Re: Optimizing byte-swap-and-store on PPC


From: Paulo César Pereira de Andrade
Subject: Re: Optimizing byte-swap-and-store on PPC
Date: Thu, 9 Feb 2023 14:06:17 -0300

On Thu, 9 Feb 2023 at 13:05, Paul Cercueil
<paul@crapouillou.net> wrote:
>
> On Thursday, 9 February 2023 at 12:46 -0300, Paulo César Pereira de Andrade
> wrote:
> > On Thu, 9 Feb 2023 at 11:20, Paul Cercueil
> > <paul@crapouillou.net> wrote:
> > >
> > > On Thursday, 9 February 2023 at 09:49 -0300, Paulo César Pereira de Andrade
> > > wrote:
> > > > On Thu, 9 Feb 2023 at 08:18, Paul Cercueil
> > > > <paul@crapouillou.net> wrote:
> > > > >
> > > > > Hi Paulo,
> > > >
> > > >   Hi Paul,
> > > >
> > > > > If you remember, I added an optimization (for my use case anyway) for
> > > > > the following code:
> > > > >
> > > > > jit_ldr(rY, rX);
> > > > > jit_bswapr(rY, rY);
> > > > >
> > > > > Lightning will only generate a single LWBRX instruction for this.
> > > > >
> > > > > Now I would like to do the same for stores. The problem is that the
> > > > > pattern typically uses a temporary register T:
> > > > >
> > > > > jit_bswapr(T, rX);
> > > > > jit_str(rY, T);
> > > > >
> > > > > How can I know that the register T is dead after this pattern?
> > > >
> > > >   It depends a bit on how the value of T is resolved.
> > > >
> > > >   If it comes from jit_get_reg(), it is live until the jit_unget_reg()
> > > > call. You should avoid branches that cannot be followed in such cases.
> > > >
> > > >   Below is the explanation for an explicit value, not one from
> > > > jit_get_reg().
> > >
> > > In my case it is with an explicit value, yes.
> > >
> > > >   It is considered live in the range from where it is set until the
> > > > last use of the register.
> > > >   It is considered dead in the range from the last use (but not set)
> > > > until the next value change. Branches might cause it to be in
> > > > a state merged as live due to use in other paths.
> > > >   The liveness range is broken if it is not a callee save register
> > > > and, when following the code, it finds a function call or an indirect
> > > > branch to a register or non jit address.
> > > >   If it is a callee save register it is always considered live.
> > >
> > > I do know when the registers are dead if I look at the code; that was
> > > not my question. What I'm wondering is whether it is possible to know
> > > if a given public register (as in JIT_V/JIT_R) is dead at a specific
> > > point in the code generation step, within Lightning itself.
> > >
> > > Something like a (private)
> > > "bool reg_is_dead(jit_node_t node, jit_int32_t reg)" function.
> > >
> > > The code generator for "jit_str" (and related) would then convert the
> > > jit_bswapr(T, rX) + jit_str(rY, T) to a single "STBRX rY, rX"
> > > instruction, if we know that T is dead after this point.
> >
> >   This should be done at code generation time. It probably would
> > require some extra steps, but the core logic should be to call the
> > macro in include/lightning/jit_private.h:
> >
> > #define jit_reg_free_p(regno)                        \
> >     (!jit_regset_tstbit(&_jitc->reglive, regno) &&            \
> >      !jit_regset_tstbit(&_jitc->regarg, regno) &&            \
> >      !jit_regset_tstbit(&_jitc->regsav, regno))
> >
> > Extra steps would be required because the jit_str(rY, T) is the start
> > of a live range for T, so the conditions need to be better understood.
> > Basically, T is live until the next time it is used as a source operand,
> > or dead if it is not used as a source operand before an instruction that
> > uses T as output/result.
>
> Isn't this information already known at code generation time? My
> assumption was that Lightning would do register liveness analysis
> before the code generation.

  It only has checkpoints of the state at block entries.  It has no
notion of register bitmaps at instruction entry and exit points.
There are two sets, the known live set and the unknown state set.
The "true" dead set is:
~ ( live | unknown )
  jit_print() does not show dead registers at block entry.
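
  As a small illustration of the formula above, here is a sketch that
computes the provably dead set from two bitmaps. The regset_t type, the
dead_set() helper and the "all" mask are hypothetical stand-ins made up
for this example; they are not lightning's jit_regset_t API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy register set: one bit per register.
 * This is NOT lightning's jit_regset_t. */
typedef uint32_t regset_t;

/* A register is provably dead only when it is neither in the known live
 * set nor in the unknown state set. "all" limits the result to the
 * registers that actually exist. */
static regset_t dead_set(regset_t live, regset_t unknown, regset_t all)
{
    /* Both input sets must only mention existing registers. */
    assert((live & ~all) == 0);
    assert((unknown & ~all) == 0);
    return ~(live | unknown) & all;
}
```

With eight registers, live = 0x05 (r0, r2) and unknown = 0x30 (r4, r5),
dead_set() returns 0xca, that is r1, r3, r6 and r7.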
  It is possible to find the state by calling:

jit_update(jit_node_t *node, jit_regset_t *live, jit_regset_t *mask)

where mask is the unknown state. It will follow register states until a
label. At the label it merges the regset with the state at the next
block entry.

  There is also the optimized jit_reglive() call that updates the
_jitc state at every instruction.
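
  To make the "follow register states until a label" idea concrete, here
is a toy forward scan over a made-up instruction array. Everything in it
(struct toy_insn, the opcodes, reg_state_after()) is hypothetical and
only models the idea; it is not how jit_update() is implemented, and it
gives up at labels and jumps instead of merging block states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy IR, not lightning's jit_node_t. */
enum toy_op { TOY_MOV, TOY_BSWAP, TOY_STORE, TOY_LABEL, TOY_JUMP };

struct toy_insn {
    enum toy_op op;
    int dst;    /* register written, or -1 */
    int src0;   /* registers read, or -1 */
    int src1;
};

enum reg_state { REG_DEAD, REG_LIVE, REG_UNKNOWN };

/* Scan forward from insns[start] and classify reg:
 *   REG_LIVE    - read before being overwritten
 *   REG_DEAD    - overwritten before being read
 *   REG_UNKNOWN - a label, a jump or the end was reached first */
static enum reg_state reg_state_after(const struct toy_insn *insns,
                                      size_t count, size_t start, int reg)
{
    assert(start <= count);
    for (size_t i = start; i < count; i++) {
        const struct toy_insn *p = &insns[i];
        /* Check reads first: a branch may itself read the register. */
        if (p->src0 == reg || p->src1 == reg)
            return REG_LIVE;
        /* Do not follow control flow; give up instead. */
        if (p->op == TOY_LABEL || p->op == TOY_JUMP)
            return REG_UNKNOWN;
        if (p->dst == reg)
            return REG_DEAD;
    }
    return REG_UNKNOWN;
}
```

Treating "unknown" the same as "live" is the conservative choice: the
scan can miss an optimization, but it never reports a register as dead
without evidence.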

  If I understand correctly, the best approach would be to use the already
computed states and not follow branches: either the register is known to
be dead, or we give up on attempting to compute the live state. This needs
to be done just after handling the jit_code_bswapr, and after the C code:

jit_regarg_set(node, value);

of the next lightning instruction.

  If this next instruction is some kind of branch, special care should be
taken.

 Pseudo code:

_emit_code() {
  ...
  if (state == state_str) {
    /* somehow figure out the value of T; note that this needs the
     * symbolic value, not the hard register value */
    T = ...;
    if (jit_get_reg(jit_class_gpr|jit_class_named|jit_class_nospill|jit_class_chk)
        != JIT_NOREG) {
      /* T is dead */
      merge_instructions();
    }
    state = state_unknown;
  }
  else if (state != state_bswap)
    state = state_unknown;
  switch (node->code) {
    ...
    case jit_code_bswap:
      ...
      state = state_bswap;
      break;
    ...
    case jit_code_str:
      ...
      if (state == state_bswap)
        state = state_str;
      else
        state = state_unknown;
      break;
    ...
  }
  ...
}
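
  For illustration only, here is a self-contained toy version of the
merge step. It is a sketch under several assumptions: the IR (struct
pp_insn and the PP_* opcodes), t_dead_after() and fuse_bswap_store() are
all made up for this example, and it uses a two-instruction lookahead
over an array rather than the state variable of the pseudo code above.
It also ignores operand widths, which a real implementation would have
to match between the swap and the store.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy IR, not lightning's node representation.
 *   PP_BSWAP:       a = destination, b = source
 *   PP_STORE:       a = address register, b = value register
 *   PP_STORE_BSWAP: a = address register, b = value stored byte-reversed
 *   PP_MOVI:        a = destination (b unused)
 *   PP_LABEL:       no operands */
enum pp_op { PP_BSWAP, PP_STORE, PP_STORE_BSWAP, PP_MOVI, PP_LABEL };

struct pp_insn { enum pp_op op; int a; int b; };

static int pp_reads(const struct pp_insn *p, int reg)
{
    if (p->op == PP_BSWAP)
        return p->b == reg;
    if (p->op == PP_STORE || p->op == PP_STORE_BSWAP)
        return p->a == reg || p->b == reg;
    return 0;
}

static int pp_writes(const struct pp_insn *p, int reg)
{
    return (p->op == PP_BSWAP || p->op == PP_MOVI) && p->a == reg;
}

/* Nonzero only if reg is overwritten before any read, label or end of
 * code after insns[index]; anything else counts as "not known dead". */
static int t_dead_after(const struct pp_insn *insns, size_t count,
                        size_t index, int reg)
{
    for (size_t i = index + 1; i < count; i++) {
        if (pp_reads(&insns[i], reg) || insns[i].op == PP_LABEL)
            return 0;
        if (pp_writes(&insns[i], reg))
            return 1;
    }
    return 0;
}

/* Rewrite "bswap T, rX; store rY, T" into one byte-reversed store when
 * T is known dead afterwards. Compacts in place, returns the new count. */
static size_t fuse_bswap_store(struct pp_insn *insns, size_t count)
{
    size_t out = 0;
    for (size_t i = 0; i < count; i++) {
        if (i + 1 < count &&
            insns[i].op == PP_BSWAP && insns[i + 1].op == PP_STORE &&
            insns[i + 1].b == insns[i].a &&   /* stored value is T */
            insns[i + 1].a != insns[i].a &&   /* T is not the address */
            t_dead_after(insns, count, i + 1, insns[i].a)) {
            struct pp_insn fused = { PP_STORE_BSWAP,
                                     insns[i + 1].a, insns[i].b };
            insns[out++] = fused;
            i++;                              /* skip the store */
            continue;
        }
        insns[out++] = insns[i];
    }
    return out;
}
```

In this model a pair followed directly by a label is left alone, which
mirrors the "either it is dead or give up" rule described earlier.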

> > > >   There isn't an explicit jit_dead() call because most times it would
> > > > allow creating bugs that are difficult to debug. There is only a
> > > > jit_live() call to allow telling that a non callee save register is
> > > > live at the point of an indirect branch or at a label entry, usually
> > > > a point where an indirect branch lands, or where some function
> > > > returns a value in a non standard way.
> > > >
> > > >   The easiest way to see what is happening is to call jit_print() and
> > > > see the state it prints at the label entry point.
> > > >
> > > >   If this reply is not clear, please provide some sample example
> > > > code or jit_print() output.
> > > >
> > > > > Cheers,
> > > > > -Paul
> > > >
> > > > Thanks,
> > > > Paulo
> > >
> > > Cheers,
> > > -Paul
>
> -Paul


