From: Emilio G. Cota
Subject: Re: [Qemu-devel] [PATCH 0/4] target/arm: Reduce overhead of cpu_get_tb_cpu_state
Date: Thu, 14 Feb 2019 12:05:56 -0500
User-agent: Mutt/1.9.4 (2018-02-28)
On Wed, Feb 13, 2019 at 20:06:48 -0800, Richard Henderson wrote:
> We've talked about this before, caching state to reduce the
> amount of computation that happens looking up each TB.
>
> I know that Peter has been concerned that we would not be able to
> reliably maintain all of the places that need to be updated to
> keep this up-to-date.
>
> Well, modulo dirty tricks within linux-user, it appears as if
> exception delivery and return, plus after every TB-ending write
> to a system register is sufficient.
>
> There seems to be a noticeable improvement, although wall-time
> is harder to come by -- all of my system-level measurements
> include user input, and my user-level measurements seem to be
> too small to matter.
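For anyone reading along without the patches at hand, the scheme quoted above can be sketched roughly as below. This is only a toy illustration of the caching pattern, not the code from the series: ToyCPU, its fields, and every toy_* function are hypothetical names made up for this sketch, and the real flags are derived from far more state than an exception level and one register bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily simplified CPU state (not QEMU's CPUARMState). */
typedef struct {
    uint32_t el;      /* current exception level */
    uint32_t sctlr;   /* stand-in for a system register that affects translation */
    uint32_t hflags;  /* cached summary of the fields above */
} ToyCPU;

/* Full derivation of the flags from scratch: the work we want to avoid per TB lookup. */
static uint32_t toy_compute_hflags(const ToyCPU *cpu)
{
    uint32_t flags = cpu->el & 3u;
    flags |= (cpu->sctlr & 1u) << 2;  /* e.g. an "MMU enabled" bit */
    return flags;
}

/* Refresh the cache; must be called wherever the inputs can change. */
static void toy_rebuild_hflags(ToyCPU *cpu)
{
    cpu->hflags = toy_compute_hflags(cpu);
}

/* Exception delivery or return changes the exception level, so rebuild. */
static void toy_exception_entry(ToyCPU *cpu, uint32_t target_el)
{
    cpu->el = target_el;
    toy_rebuild_hflags(cpu);
}

/* A TB-ending system register write also changes an input, so rebuild. */
static void toy_write_sysreg(ToyCPU *cpu, uint32_t value)
{
    cpu->sctlr = value;
    toy_rebuild_hflags(cpu);
}

/* Hot path: return the cached value; the assert checks the cache is not stale. */
static uint32_t toy_get_tb_flags(const ToyCPU *cpu)
{
    assert(cpu->hflags == toy_compute_hflags(cpu));
    return cpu->hflags;
}
```

The assert in the lookup path mirrors the idea in the title of patch 3/4 (check that the cached value matches a fresh computation); a later step could then drop the check and rely on the cache alone, as the title of patch 4/4 suggests.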
Thanks for this!
Some SPEC06int user-mode numbers (before vs. after)
aarch64-linux-user speedup for SPEC06int (test set)
Host: Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz
[Bar chart: speedup (y-axis, 1 to 2) for each SPEC06int benchmark, followed by their geomean]
png: https://imgur.com/RjkYYJ5
That is, a 1.4x average speedup.
Emilio