Re: [PATCH v2 0/9] mips: Fill delay slots v3

lightning

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 0/9] mips: Fill delay slots v3

From:	Paul Cercueil
Subject:	Re: [PATCH v2 0/9] mips: Fill delay slots v3
Date:	Wed, 18 Jan 2023 12:42:22 +0000

Hi Paulo,

Le dimanche 15 janvier 2023 à 09:54 -0300, Paulo César Pereira de
Andrade a écrit :
> Em dom., 15 de jan. de 2023 às 08:05, Paul Cercueil
> <paul@crapouillou.net> escreveu:
> > 
> > Hi Paulo,
> 
>   Hi Paul,
> 
> > Le samedi 14 janvier 2023 à 22:42 -0300, Paulo César Pereira de
> > Andrade
> > a écrit :
> > > Em sáb., 14 de jan. de 2023 às 12:11, Paul Cercueil
> > > <paul@crapouillou.net> escreveu:
> > > > 
> > > > Hi Paulo,
> > > 
> > >   Hi Paul,
> > > 
> > >   Patches pushed as all tests pass.
> > 
> > Great! ;)
> > 
> > > > Here's the V3 of my patchset that attempts to fill branch delay
> > > > slots on
> > > > MIPS.
> > > > 
> > > > All tests do pass now, except the "float.nodata" check, but
> > > > this
> > > > one
> > > > fails in master as well.
> > > 
> > >   What environment are you using for testing? It works in the
> > > mips
> > > environments I am testing.
> > > 
> > >   Please let me know the output of:
> > > 
> > > $ cpp -dM </dev/null
> > 
> > Just a mips64 toolchain built using buildroot.
> > 
> > I believe you meant
> > cpp -dM -E - < /dev/null
> 
>   It is a very recent gcc, and probably kernel. My guess is that the
> issue
> is caused by some kind of unaligned access. Is it real hardware or
> some
> emulator?
> 
>   Can you please run it as:
> 
> $ cd check
> $ make debug
> ...
> (gdb) r -d float.tst
> 
> and see what if it is just somehow creating a bad immediate or if
> it is triggering some kind of fault, like misaligned or invalid
> instruction?

This was under Qemu with my MIPS64 toolchain.

However today as I try to reproduce it, all the checks do pass. Maybe a
system update on my PC brought a fix (to qemu), or maybe it does happen
only under some conditions.

I'll debug it if I encounter it again.

> > See attachment.
> > 
> > > > I implemented your suggestion and it worked just fine. There
> > > > was
> > > > one
> > > > last problem though, which caused the "clobber" check to fail
> > > > once
> > > > again; code emitters that generated branches to the next
> > > > instruction (to conditionally execute some opcodes) caused
> > > > problems, as
> > > > the next instruction was not detected as a jump target (no
> > > > "jit_flag_patch" set). I solved this by just checking the
> > > > previous
> > > > node's code against a table of accepted values.
> > > 
> > >   Not certain if I fully understand this. It also appears to be a
> > > somewhat
> > > fragile logic.
> > > 
> > >   Everytime the delay slot is used, there should be a comment
> > > telling that instruction is in the delay slot, for example:
> > > 
> > > static void
> > > _ltr_f(jit_state_t *_jit, jit_int32_t r0, jit_int32_t r1,
> > > jit_int32_t
> > > r2)
> > > {
> > >     jit_word_t        w;
> > >     C_OLT_S(r1, r2);
> > >     w = _jit->pc.w;
> > >     BC1T(0);
> > >     /* delay slot */
> > >     movi(r0, 1);
> > >     movi(r0, 0);
> > >     patch_at(w, _jit->pc.w);
> > > }
> > 
> > There is a comment indeed, but the comment is gone as soon as you
> > compile the code.
> > 
> > If you have a 'ltr_f' followed by a 'beqr', the code will try to
> > find a
> > delay slot for the 'beqr', and will detect that the 'movi(r0, 0)'
> > (aka.
> > the last opcode generated by 'ltr_f') can safely be moved, because
> > there is nothing that can tell us that the last movi is a jump
> > target.
> > 
> > So I just blacklist 'ltr_f' as well as others so that the emitted
> > code
> > will always be correct.
> 
>   I understand now.  The problem I see is that it is not very clear
> from
> the patch, and/or changes elsewhere could cause it to no longer be
> required, or have some missing check if new lightning instructions
> are added.
> 
>   The problem is that can_swap_ds() does not check the pattern of a
> jump in 2 (for fpr comparison) or 3 (for compare-and-swap)
> instructions
> before the last machine instruction that it attempts to move.

Yes, it only checks the last instruction. Going further than that would
increase the complexity of the code, I think.

>   Currently there isn't a clear way to pass parameters to jit
> generation.
>   The closest is the jit_cpu_t for arm and x86. Maybe we could add a
> jit_cpu_t for mips, and have an option to make the swap with delay
> slot optional? This would make it easy to validate that if a bug is
> found
> in the future, it can be easily verified that it is not a side effect
> of the
> changes in these patches. That is, have a defined jit_cpu_t for mips
> in include/lightning/jit_mips.h and then in
> lib/jit_mips.c:jit_get_cpu()
> set it to a default value, and use the flag allowing or disallowing
> the
> patch during jit generation.

That should be doable, yes. But what would set the value of the flag?
Would it be a ./configure option?

> > > so, you check the previous opcode for float comparison and
> > > cas{r,i}.
> > > Shouldn't it also check for b{o,x}{add,sub}{r,i} codes? These
> > > also
> > > end with a delay slot usage, and based on the patches and your
> > > description, not checking them might create very difficult to
> > > debug
> > > bugs in jit doing complex branch logic on carry/overflow.
> > 
> > The b{o,x}{add,sub}{r,i} end up with a delay slot, and that's
> > perfectly
> > fine. The algorithm will detect that and won't try to move the
> > opcode
> > (that's the has_delay_slot() check in can_swap_ds()).
> > 
> > Cheers,
> > -Paul
> 
> Thanks,
> Paulo

Cheers,
-Paul

[Prev in Thread]

Current Thread

[Next in Thread]

[PATCH v3 2/8] mips: Fill delay slots of JR opcodes in jit_jmpr, (continued)
- [PATCH v3 2/8] mips: Fill delay slots of JR opcodes in jit_jmpr, Paul Cercueil, 2023/01/14
- [PATCH v3 3/8] mips: Fill delay slots of JALR opcodes in jit_callr, Paul Cercueil, 2023/01/14
- [PATCH v3 4/8] mips: Fill delay slots of J in jit_jmpi, Paul Cercueil, 2023/01/14
- [PATCH v3 5/8] mips: Fill delay slots in jit_beqr / jit_beqi, Paul Cercueil, 2023/01/14
- [PATCH v3 6/8] mips: Fill delay slots in jit_bner / jit_bnei, Paul Cercueil, 2023/01/14
- [PATCH v3 7/8] mips: Fill delay slots in jit_bgtr, jit_bgti, jit_bler, jit_blei, Paul Cercueil, 2023/01/14
- [PATCH v3 8/8] mips: Fill delay slots in jit_bger, jit_bgei, jit_bltr, jit_blti, Paul Cercueil, 2023/01/14
- Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/14
  - Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paul Cercueil, 2023/01/15
    - Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/15
    - Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paul Cercueil <=
    - Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/18
    - Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/20
    - Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paul Cercueil, 2023/01/20
    - Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/20
    - Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/20
    - Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paul Cercueil, 2023/01/20
    - Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/20
    - Re: [PATCH v2 0/9] mips: Fill delay slots v3, Paulo César Pereira de Andrade, 2023/01/20

Prev by Date: Re: [PATCH v2 0/9] mips: Fill delay slots v3
Next by Date: Re: [PATCH v2 0/9] mips: Fill delay slots v3
Previous by thread: Re: [PATCH v2 0/9] mips: Fill delay slots v3
Next by thread: Re: [PATCH v2 0/9] mips: Fill delay slots v3
Index(es):
- Date
- Thread