Re: [Qemu-devel] [PATCH v2 4/4] cputlb: read CPUTLBEntry.addr_write atomically


From: Emilio G. Cota
Subject: Re: [Qemu-devel] [PATCH v2 4/4] cputlb: read CPUTLBEntry.addr_write atomically
Date: Thu, 4 Oct 2018 00:01:47 -0400
User-agent: Mutt/1.9.4 (2018-02-28)

On Wed, Oct 03, 2018 at 16:04:54 -0400, Emilio G. Cota wrote:
> Updates can come from other threads, so readers that do not
> take tlb_lock must use atomic_read to avoid undefined
> behaviour (UB).
> 
> This and the previous commit result in a small performance decrease,
> but this is a fair price for removing UB.
(snip)
> That is, a ~2% slowdown for the aarch64 bootup+shutdown test.
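
For reference, the lockless read described above boils down to something
like the sketch below (the helper name and the TCG_OVERSIZED_GUEST special
case here are illustrative; see the patch itself for the real code):

#include "qemu/osdep.h"
#include "qemu/atomic.h"
#include "exec/cpu-defs.h"   /* CPUTLBEntry, target_ulong */

/*
 * A writer holding tlb_lock may update addr_write concurrently, so a
 * reader that does not take the lock must not use a plain load: that
 * would be a data race, i.e. UB.  atomic_read() turns it into a
 * relaxed atomic load.  Oversized-guest builds (64-bit guest on a
 * 32-bit host) cannot load the field atomically, so they keep the
 * plain load.
 */
static inline target_ulong tlb_addr_write(const CPUTLBEntry *entry)
{
#if TCG_OVERSIZED_GUEST
    return entry->addr_write;
#else
    return atomic_read(&entry->addr_write);
#endif
}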

I've run more tests. This slowdown is much more pronounced on
memory-heavy workloads. These are the numbers for SPEC06int:

                                Speedup over master

  1.05 +-+--+----+----+----+----+----+----+---+----+----+----+----+----+--+-+
       |                                 +++  ||      +++                   |
       |tlb-lock-noatomic      +++        |  **|       |+++                 |
       |          +atomic       |  ++++   |  **##      | |                  |
     1 +-+..+++...............++##.***#...|..**|#......**|................+-+
       |    ###     ***++     ***# *+*# +++  **+#  +++ **##                 |
       |    # #     *+*#      *|*# *+*#  ||  ** # **## **|#                 |
       |    # #     * *#+     *+*# * *#  ||  ** # **+#+**|#     +**  ++###  |
  0.95 +-+..#.#.....*.*#......*.*#.*.*#.***#.**.#.**.#.**|#......**##***+#+-+
       |    # #     * *#      * *# * *# *|*# ** # ** # **+#      **+#* * #  |
       |    # #     * *#      * *# * *# *|*# ** # ** # ** #+++++ ** #* * #  |
   0.9 +-+***.#..+++*.*#......*.*#.*.*#.*+*#.**.#.**.#.**.#+**|..**.#*.*.#+-+
       |  * * #***##* *#      * *# * *# * *# ** # ** # ** # **## ** #* * #  |
       |  * * #* *+#* *#   +++* *# * *# * *# ** # ** # ** # **|# ** #* * #  |
       |  * * #* * #* *# ***# * *# * *# *+*# ** # ** # ** # **+# ** #* * #  |
  0.85 +-+*.*.#*.*.#*.*#.*.*#+*.*#.*.*#.*.*#.**.#.**.#.**.#.**.#.**.#*.*.#+-+
       |  * * #* * #* *# * *# * *# * *# * *# ** # ** # ** # ** # ** #* * #  |
       |  * * #* * #* *# * *# * *# * *# * *# ** # ** # ** # ** # ** #* * #  |
       |  * * #* * #* *# * *# * *# * *# * *# ** # ** # ** # ** # ** #* * #  |
   0.8 +-+***##***##***#-***#-***#-***#-***#-**##-**##-**##-**##-**##***##+-+
        401.bzip2 403.gcc 429.mcf 445.gobmk 456.hmmer 462.libquantum 464.h264ref 471.omnetpp 473.astar 483.xalancbmk geomean

That is, a 5% average slowdown, with a max slowdown of ~14% for
mcf :-(

I'll profile tomorrow and see where the slowdown comes from.
If the lock is the issue, we might be better off shifting
all of that work to cross-vCPU calls (e.g. doing a round of
synchronous cross-vCPU calls via run_on_cpu), provided the
assumption that those calls are very rare holds.
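
Roughly, what I mean by a round of synchronous cross-vCPU calls is the
sketch below (the callback and the per-vCPU work are placeholders;
run_on_cpu(), CPU_FOREACH() and RUN_ON_CPU_TARGET_PTR() are the existing
APIs):

#include "qemu/osdep.h"
#include "qom/cpu.h"          /* CPUState, run_on_cpu(), CPU_FOREACH() */
#include "exec/exec-all.h"    /* tlb_flush_page() */

/* Runs on the target vCPU's own thread, so that vCPU's TLB can be
 * touched without taking tlb_lock. */
static void do_tlb_work(CPUState *cpu, run_on_cpu_data data)
{
    target_ulong addr = (target_ulong)data.target_ptr;

    tlb_flush_page(cpu, addr);   /* placeholder for the actual per-vCPU work */
}

/* A round of synchronous cross-vCPU calls: run_on_cpu() does not return
 * until the target vCPU has executed the function, so once the loop
 * finishes every vCPU has done its part. */
static void do_tlb_work_all_cpus_synced(target_ulong addr)
{
    CPUState *cpu;

    CPU_FOREACH(cpu) {
        run_on_cpu(cpu, do_tlb_work, RUN_ON_CPU_TARGET_PTR(addr));
    }
}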

                Emilio


