qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH 0/2] tcg: Merge two sequential labels


From: Taylor Simpson
Subject: RE: [PATCH 0/2] tcg: Merge two sequential labels
Date: Sat, 4 Mar 2023 00:27:57 +0000


> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Friday, March 3, 2023 3:31 PM
> To: qemu-devel@nongnu.org
> Cc: anjo@rev.ng; Taylor Simpson <tsimpson@quicinc.com>
> Subject: [PATCH 0/2] tcg: Merge two sequential labels
> 
> WARNING: This email originated from outside of Qualcomm. Please be wary
> of any links or attachments, and do not enable macros.
> 
> Supercedes: 20230301142221.24495-1-anjo@rev.ng
> ("[PATCH] tcg: `reachable_code_pass()` remove empty else-branch")
> 
> Anton has probably seen this already but his patch now has conflicts with
> master.  Rather than a while loop looking back across labels, let's fold all
> adjacent labels into a single label, at which point the current branch-to-next
> optimization works as is.
> 
> Perhaps I missed it, but I didn't see how to reproduce the issue being
> resolved by the hexagon front end?  So I haven't this trigger, but it passes
> the testsuite.  :-)
> 
> 
> r~

Hi Richard,

Thanks!!

You need this series of patches to see this pattern from Hexagon
https://patchew.org/QEMU/20230131225647.25274-1-tsimpson@quicinc.com/

I have an update to that series for your tcg_temp_local_* series, but I'm 
wondering if I should wait for your tcg_temp_free patches to land first.  
Please advise.

Anyway, I tested your patch, and can see the optimization happening.  Here is 
the input code and unoptimized TCG:

----------------
IN: 
0x00400094:  0xf900c102 {       if (P0) R2 = and(R0,R1) }

OP:
 ld_i32 loc0,env,$0xfffffffffffffff0
 brcond_i32 loc0,$0x0,lt,$L0

 ---- 00400094
 and_i32 loc2,p0,$0x1
 brcond_i32 loc2,$0x0,eq,$L1
 and_i32 loc4,r0,r1
 mov_i32 r2,loc4
 br $L2
 set_label $L1
 set_label $L2
 add_i32 pkt_cnt,pkt_cnt,$0x1
 add_i32 insn_cnt,insn_cnt,$0x1
 mov_i32 pc,$0x400098
 exit_tb $0x0
 set_label $L0
 exit_tb $0x7fea80000043


Here is the optimized TCG after your patches.  The two labels have been merged, 
and the unnecessary branch has been removed.

OP after optimization and liveness analysis:
 ld_i32 tmp0,env,$0xfffffffffffffff0      dead: 1  pref=0xffff
 brcond_i32 tmp0,$0x0,lt,$L0              dead: 0

 ---- 00400094                         
 and_i32 tmp2,p0,$0x1                     dead: 1 2  pref=0xffff
 brcond_i32 tmp2,$0x0,eq,$L2              dead: 0 1
 and_i32 tmp4,r0,r1                       dead: 1 2  pref=0xffff
 mov_i32 r2,tmp4                          sync: 0  dead: 0 1  pref=0xffff
 set_label $L2                          
 add_i32 pkt_cnt,pkt_cnt,$0x1             sync: 0  dead: 0 1  pref=0xffff
 add_i32 insn_cnt,insn_cnt,$0x1           sync: 0  dead: 0 1 2  pref=0xffff
 mov_i32 pc,$0x400098                     sync: 0  dead: 0 1  pref=0xffff
 exit_tb $0x0                           
 set_label $L0                          
 exit_tb $0x7fea80000043


Tested-by: Taylor Simpson <tsimpson@quicinc.com>



reply via email to

[Prev in Thread] Current Thread [Next in Thread]