Hi Anders,
I'm not well versed in tuxrun, or in how to make it work with a qemu
binary outside of the container, so I'm not sure whether I'm comparing
apples to bananas. Can you check whether this fixes the kselftest
slowdown you reported?
Anyway, for a boot and shutdown of your rootfs, I see:
Before:
11.13% [.] aa64_va_parameters
8.38% [.] helper_lookup_tb_ptr
7.37% [.] pauth_computepac
3.79% [.] qht_lookup_custom
After:
9.17% [.] helper_lookup_tb_ptr
8.05% [.] pauth_computepac
4.22% [.] qht_lookup_custom
3.68% [.] pauth_addpac
...
1.67% [.] aa64_va_parameters
This is all due to the heavy use pauth makes of aa64_va_parameters.
It "only" needs 2 parameters, tsz and tbi, but tsz is probably the
most expensive part of aa64_va_parameters -- once we do anything about
that, we might as well cache the whole thing.
The change from struct+bitfields to uint32_t+FIELD is meant to combat
some really ugly code that gcc produced. It seems like the two forms
should have compiled to the same thing, more or less, but alas.
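For readers unfamiliar with the two representations, the following is a
rough standalone sketch of packing parameters into a uint32_t with explicit
shift-and-mask accessors. The helpers field_ex32 and field_dp32, the VAP_*
constants and the field widths are assumptions made for illustration; they
are not QEMU's FIELD macros and not the actual ARMVAParameters layout.

```c
#include <stdint.h>

/* Hypothetical field layout: tsz in bits [7:0], tbi in bits [9:8]. */
#define VAP_TSZ_SHIFT 0
#define VAP_TSZ_LEN   8
#define VAP_TBI_SHIFT 8
#define VAP_TBI_LEN   2

/* Extract a len-bit field starting at bit shift (len must be < 32). */
static inline uint32_t field_ex32(uint32_t v, unsigned shift, unsigned len)
{
    return (v >> shift) & ((1u << len) - 1);
}

/* Return v with the len-bit field at bit shift replaced by x. */
static inline uint32_t field_dp32(uint32_t v, unsigned shift, unsigned len,
                                  uint32_t x)
{
    uint32_t mask = ((1u << len) - 1) << shift;

    return (v & ~mask) | ((x << shift) & mask);
}

/* Pack both parameters into one word that fits in a register. */
static inline uint32_t vap_pack(unsigned tsz, unsigned tbi)
{
    uint32_t v = 0;

    v = field_dp32(v, VAP_TSZ_SHIFT, VAP_TSZ_LEN, tsz);
    v = field_dp32(v, VAP_TBI_SHIFT, VAP_TBI_LEN, tbi);
    return v;
}
```

The packed word is an ordinary integer, so passing and returning it by value
involves no struct copies, and each access is an explicit shift and mask.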
r~
Richard Henderson (4):
  target/arm: Flush only required tlbs for TCR_EL[12]
  target/arm: Store tbi for both insns and data in ARMVAParameters
  target/arm: Use FIELD for ARMVAParameters
  target/arm: Cache ARMVAParameters

 target/arm/cpu.h          |  30 +++++++
 target/arm/internals.h    |  21 +----
 target/arm/helper.c       | 177 ++++++++++++++++++++++++++++----------
 target/arm/pauth_helper.c |  39 +++++----
 target/arm/ptw.c          |  57 ++++++------
 5 files changed, 217 insertions(+), 107 deletions(-)