Re: [PATCH 0/3] Reorg ppc64 pmu insn counting
From: Daniel Henrique Barboza
Subject: Re: [PATCH 0/3] Reorg ppc64 pmu insn counting
Date: Thu, 23 Dec 2021 17:36:30 -0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.4.0
On 12/23/21 00:01, Richard Henderson wrote:
> In contrast to Daniel's version, the code stays in power8-pmu.c,
> but is better organized to not take so much overhead.
>
> Before:
>
>   32.97%  qemu-system-ppc  qemu-system-ppc64  [.] pmc_get_event
>   20.22%  qemu-system-ppc  qemu-system-ppc64  [.] helper_insns_inc
>    4.52%  qemu-system-ppc  qemu-system-ppc64  [.] hreg_compute_hflags_value
>    3.30%  qemu-system-ppc  qemu-system-ppc64  [.] helper_lookup_tb_ptr
>    2.68%  qemu-system-ppc  qemu-system-ppc64  [.] tcg_gen_code
>    2.28%  qemu-system-ppc  qemu-system-ppc64  [.] cpu_exec
>    1.84%  qemu-system-ppc  qemu-system-ppc64  [.] pmu_insn_cnt_enabled
>
> After:
>
>    8.42%  qemu-system-ppc  qemu-system-ppc64  [.] hreg_compute_hflags_value
>    6.65%  qemu-system-ppc  qemu-system-ppc64  [.] cpu_exec
>    6.63%  qemu-system-ppc  qemu-system-ppc64  [.] helper_insns_inc
Thanks for looking this up. I had no idea the original C code was that slow.
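For readers following the thread, here is a minimal sketch of the general idea named in the first patch title ("Cache per-pmc insn and cycle count settings"): decide once, when the event configuration changes, which counters count instructions, and keep the answer in a bitmask so the hot path does not decode events on every call. This is not the code from either series. The struct, the `sketch_*` function names, and the counter count are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_PMCS 6   /* hypothetical counter count, illustration only */

/* Hypothetical stand-in for per-CPU PMU state; not QEMU's CPUPPCState. */
typedef struct SketchPMU {
    uint64_t pmc[SKETCH_NUM_PMCS];
    uint32_t ins_cnt_mask;   /* bit i set: pmc[i] counts instructions */
} SketchPMU;

/* Slow path: run only when the (hypothetical) event config changes. */
static void sketch_set_insn_counting(SketchPMU *pmu, unsigned idx, int enabled)
{
    assert(idx < SKETCH_NUM_PMCS);
    if (enabled) {
        pmu->ins_cnt_mask |= 1u << idx;
    } else {
        pmu->ins_cnt_mask &= ~(1u << idx);
    }
}

/* Hot path: visit only the counters whose bit is set in the cached mask. */
static void sketch_insns_inc(SketchPMU *pmu, uint32_t num_insns)
{
    uint32_t mask = pmu->ins_cnt_mask;

    while (mask) {
        unsigned idx = 0;

        /* Find the index of the lowest set bit. */
        while (!((mask >> idx) & 1u)) {
            idx++;
        }
        mask &= mask - 1;   /* clear that bit */
        pmu->pmc[idx] += num_insns;
    }
}
```

The trade-off in a scheme like this is that every place that changes the event configuration has to refresh the cached mask, otherwise the hot path counts into the wrong set of counters.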
This reorg is breaking PMU-EBB tests, unfortunately. These tests are run
from the kernel tree [1] and I test them inside a pSeries TCG guest. You'll
need to apply patches 9 and 10 of [2] beforehand (they apply cleanly on
current master) because they aren't upstream yet and EBB needs them.
The tests that are breaking consistently with this reorg are:
back_to_back_ebbs_test.c
cpu_event_pinned_vs_ebb_test.c
cycles_test.c
task_event_pinned_vs_ebb_test.c
The issue here is that these tests exercise different Perf events and
aspects of branching (e.g. how fast we're detecting a counter overflow, how
many times, etc.) and I haven't been able to find a fix using your C reorg
yet.
With that in mind I decided to post a new version of my TCG rework, with
less repetition and a bit more concise, as an alternative that can be used
upstream to fix the Avocado tests.
Meanwhile I'll see if I can get your reorg working with all the EBB tests we
need. All things being equal (similar performance, all EBB tests passing),
I'd rather go with your C code than my TCG rework, since yours doesn't
require knowledge of TCG ops to maintain.
Thanks,
Daniel
[1] https://github.com/torvalds/linux/tree/master/tools/testing/selftests/powerpc/pmu/ebb
[2] https://lists.gnu.org/archive/html/qemu-devel/2021-12/msg00073.html
> r~
>
> Richard Henderson (3):
>   target/ppc: Cache per-pmc insn and cycle count settings
>   target/ppc: Rewrite pmu_increment_insns
>   target/ppc: Use env->pnc_cyc_cnt
>
>  target/ppc/cpu.h         |   3 +
>  target/ppc/power8-pmu.h  |  14 +--
>  target/ppc/cpu_init.c    |   1 +
>  target/ppc/helper_regs.c |   2 +-
>  target/ppc/machine.c     |   2 +
>  target/ppc/power8-pmu.c  | 230 ++++++++++++++++-----------------------
>  6 files changed, 108 insertions(+), 144 deletions(-)