qemu-devel

Re: [PATCH] tcg/ppc: Optimize 26-bit jumps


From: Richard Henderson
Subject: Re: [PATCH] tcg/ppc: Optimize 26-bit jumps
Date: Thu, 8 Sep 2022 22:44:18 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0

On 9/8/22 22:18, Leandro Lupori wrote:
> PowerPC64 processors handle direct branches better than indirect
> ones, resulting in less stalled cycles and branch misses.
>
> However, PPC's tb_target_set_jmp_target() was only using direct
> branches for 16-bit jumps, while PowerPC64's unconditional branch
> instructions are able to handle displacements of up to 26 bits.
> To take advantage of this, now jumps whose displacements fit in
> between 17 and 26 bits are also converted to direct branches.

This doesn't work because you have to be able to unset the jump as well, and your two-step sequence doesn't handle that. (You wind up with the two-insn address load reset, but the jump still going to the previous target -- boom.)
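
To make the failure mode concrete, here is a toy C model of a four-insn jump slot. It is only a sketch, not the tcg/ppc code: the slot layout, the enum, and the helpers slot_init(), set_two_step(), reset_load_only() and run_slot() are all hypothetical, and the model assumes that the reset path rewrites only the first 8 bytes (the address load) while the direct branch patched in at the third insn is left behind.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical abstract insn kinds; real encodings are deliberately omitted. */
enum kind { LOAD_HI, LOAD_LO, MTCTR, BCTR, DIRECT_B };

struct insn {
    enum kind kind;
    uint64_t addr;      /* address operand, where the kind uses one */
};

/* Assumed layout: address load (2 insns), then mtctr, then bcctr. */
struct jmp_slot {
    struct insn i[4];
};

/* Step 1 of the update: rewrite the two-insn address load (one 8-byte store). */
static void write_load_pair(struct jmp_slot *s, uint64_t addr)
{
    s->i[0] = (struct insn){ LOAD_HI, addr };
    s->i[1] = (struct insn){ LOAD_LO, addr };
}

static void slot_init(struct jmp_slot *s, uint64_t next)
{
    write_load_pair(s, next);
    s->i[2] = (struct insn){ MTCTR, 0 };
    s->i[3] = (struct insn){ BCTR, 0 };
}

/* Two-step "set": update the load pair, then patch in a direct branch. */
static void set_two_step(struct jmp_slot *s, uint64_t target)
{
    write_load_pair(s, target);
    s->i[2] = (struct insn){ DIRECT_B, target };
}

/* "Unset" that only rewrites the load pair, leaving the branch in place. */
static void reset_load_only(struct jmp_slot *s, uint64_t next)
{
    write_load_pair(s, next);
}

/* Interpret the slot and return the address control transfers to. */
static uint64_t run_slot(const struct jmp_slot *s)
{
    uint64_t reg = 0, ctr = 0;

    for (int n = 0; n < 4; n++) {
        const struct insn *in = &s->i[n];

        switch (in->kind) {
        case LOAD_HI:  reg = in->addr & ~UINT64_C(0xffff); break;
        case LOAD_LO:  reg |= in->addr & 0xffff;           break;
        case MTCTR:    ctr = reg;                          break;
        case BCTR:     return ctr;
        case DIRECT_B: return in->addr;
        }
    }
    assert(!"slot must end in a branch");
    return 0;
}
```

In this model, calling set_two_step() and then reset_load_only() leaves run_slot() returning the old target: the register holds the reset address, but the stale direct branch is taken before the mtctr+bcctr pair can use it.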

For v2.07+, you could use stq to update 4 insns atomically.

For v3.1+, you can eliminate TCG_REG_TB, using prefixed pc-relative addressing instead. That brings you back to only needing to update 8 bytes atomically: select either paddi to compute the address to feed to the following mtctr+bcctr, or a direct branch + nop, leaving the mtctr+bcctr alone and unreachable.
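
A rough C sketch of how that 8-byte selection could look, under several assumptions: the slot is 8-byte aligned (so the prefixed insn cannot cross a 64-byte boundary), both the branch and the paddi sit at the start of the slot so their displacements are relative to slot_addr, and the caller performs the actual aligned 8-byte store and icache flush. The helper names and the tmp_reg parameter are illustrative, not existing tcg/ppc code; the encodings follow the I-form `b` and the MLS:D-form `paddi rt,0,imm,1`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PPC_B    (UINT32_C(18) << 26)    /* b: unconditional branch, AA=0 LK=0 */
#define PPC_NOP  UINT32_C(0x60000000)    /* ori 0,0,0 */

/* True if a byte displacement fits the signed 26-bit field of `b`. */
static bool in_range_b(int64_t disp)
{
    return disp >= -(INT64_C(1) << 25) && disp < (INT64_C(1) << 25);
}

/* Encode `b .+disp`; disp must be in range and word aligned. */
static uint32_t encode_b(int64_t disp)
{
    assert(in_range_b(disp) && (disp & 3) == 0);
    return PPC_B | ((uint32_t)disp & 0x03fffffc);
}

/* Encode `paddi rt,0,imm,1` (pc-relative, 34-bit signed immediate). */
static void encode_pla(unsigned rt, int64_t imm,
                       uint32_t *prefix, uint32_t *suffix)
{
    assert(rt < 32);
    assert(imm >= -(INT64_C(1) << 33) && imm < (INT64_C(1) << 33));
    /* Prefix: opcode 1, type 2 (MLS), R=1, upper 18 bits of the immediate. */
    *prefix = (UINT32_C(1) << 26) | (UINT32_C(2) << 24) | (UINT32_C(1) << 20)
            | ((uint32_t)((uint64_t)imm >> 16) & 0x3ffff);
    /* Suffix: addi rt,0 with the lower 16 bits of the immediate. */
    *suffix = (UINT32_C(14) << 26) | ((uint32_t)rt << 21)
            | ((uint32_t)imm & 0xffff);
}

/* Pack two insns so that `first` lands at the lower address in memory. */
static uint64_t make_pair(uint32_t first, uint32_t second, bool big_endian)
{
    return big_endian ? (uint64_t)first << 32 | second
                      : (uint64_t)second << 32 | first;
}

/*
 * Pick the 8-byte value for the start of the jump slot:
 * either `b target; nop`, or a paddi that loads the target into tmp_reg
 * for the mtctr+bcctr that follows the slot.
 */
static uint64_t jump_slot_value(uint64_t slot_addr, uint64_t target,
                                unsigned tmp_reg, bool big_endian)
{
    int64_t disp = (int64_t)(target - slot_addr);
    uint32_t prefix, suffix;

    if (in_range_b(disp) && (disp & 3) == 0) {
        return make_pair(encode_b(disp), PPC_NOP, big_endian);
    }
    encode_pla(tmp_reg, disp, &prefix, &suffix);
    return make_pair(prefix, suffix, big_endian);
}
```

Because both alternatives occupy the same 8 bytes, setting and unsetting the jump each reduce to one aligned store of the value returned by jump_slot_value(), and the mtctr+bcctr after the slot never needs to be touched.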

(Actually, there are lots of updates one could make to tcg/ppc for v3.1...)


r~


