Re: [REPORT] Nightly Performance Tests - Sunday, September 13, 2020

qemu-devel

From:	Philippe Mathieu-Daudé
Subject:	Re: [REPORT] Nightly Performance Tests - Sunday, September 13, 2020
Date:	Mon, 14 Sep 2020 15:05:25 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.11.0

On 9/14/20 2:43 PM, Aleksandar Markovic wrote:
> On Mon, Sep 14, 2020 at 12:52 PM Ahmed Karaman
> <ahmedkhaledkaraman@gmail.com> wrote:
> 
>     On Mon, Sep 14, 2020 at 8:46 AM Philippe Mathieu-Daudé
>     <f4bug@amsat.org> wrote:
>     >
>     > Hi Ahmed,
>     >
>     > On 9/14/20 12:07 AM, Ahmed Karaman wrote:
>     > > Host CPU         : Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
>     > > Host Memory      : 15.49 GB
>     > >
>     > > Start Time (UTC) : 2020-09-13 21:35:01
>     > > End Time (UTC)   : 2020-09-13 22:07:44
>     > > Execution Time   : 0:32:42.230467
>     > >
>     > > Status           : SUCCESS
>     > >
>     > > Note:
>     > > Changes denoted by '-----' are less than 0.01%.
>     > >
>     > > --------------------------------------------------------
>     > >             SUMMARY REPORT - COMMIT f00f57f3
>     > > --------------------------------------------------------
>     >
>     > (Maybe this was already commented earlier but I missed it).
>     >
>     > What change had such a significant impact on the m68k target?
>     > At a glance, I see mostly softfloat changes:
>     >
>     > $ git log --oneline v5.1.0..f00f57f3 tcg target/m68k fpu
>     > fe4b0b5bfa9 tcg: Implement 256-bit dup for tcg_gen_gvec_dup_mem
>     > 6a17646176e tcg: Eliminate one store for in-place 128-bit dup_mem
>     > e7e8f33fb60 tcg: Fix tcg gen for vectorized absolute value
>     > 5ebf5f4be66 softfloat: Define misc operations for bfloat16
>     > 34f0c0a98a5 softfloat: Define convert operations for bfloat16
>     > 8282310d853 softfloat: Define operations for bfloat16
>     > 0d93d8ec632 softfloat: Add fp16 and uint8/int8 conversion functions
>     > fbcc38e4cb1 softfloat: add xtensa specialization for pickNaNMulAdd
>     > 913602e3ffe softfloat: pass float_status pointer to pickNaN
>     > cc43c692511 softfloat: make NO_SIGNALING_NANS runtime property
>     > 73ebe95e8e5 target/ppc: add vmulld to INDEX_op_mul_vec case
>     >
>     > > --------------------------------------------------------
>     > > --------------------------------------------------------
>     > > Test Program: matmult_double
>     > > --------------------------------------------------------
>     > > Target              Instructions      Latest      v5.1.0
>     > > ----------  --------------------  ----------  ----------
>     > > aarch64            1 412 412 599       -----     +0.311%
>     > > alpha              3 233 957 639       -----     +7.472%
>     > > arm                8 545 302 995       -----      +1.09%
>     > > hppa               3 483 527 330       -----     +4.466%
>     > > m68k               3 919 110 506       -----    +18.433%
>     > > mips               2 344 641 840       -----     +4.085%
>     > > mipsel             3 329 912 425       -----     +5.177%
>     > > mips64             2 359 024 910       -----     +4.075%
>     > > mips64el           3 343 650 686       -----     +5.166%
>     > > ppc                3 209 505 701       -----     +3.248%
>     > > ppc64              3 287 495 266       -----     +3.173%
>     > > ppc64le            3 287 135 580       -----     +3.171%
>     > > riscv64            1 221 617 903       -----     +0.278%
>     > > s390x              2 874 160 417       -----     +5.826%
>     > > sh4                3 544 094 841       -----      +6.42%
>     > > sparc64            3 426 094 848       -----     +7.138%
>     > > x86_64             1 249 076 697       -----     +0.335%
>     > > --------------------------------------------------------
>     > ...
>     > > --------------------------------------------------------
>     > > Test Program: qsort_double
>     > > --------------------------------------------------------
>     > > Target              Instructions      Latest      v5.1.0
>     > > ----------  --------------------  ----------  ----------
>     > > aarch64            2 709 839 947       -----     +2.423%
>     > > alpha              1 969 432 086       -----     +3.679%
>     > > arm                8 323 168 267       -----     +2.589%
>     > > hppa               3 188 316 726       -----       +2.9%
>     > > m68k               4 953 947 225       -----    +15.153%
>     > > mips               2 123 789 120       -----     +3.049%
>     > > mipsel             2 124 235 492       -----     +3.049%
>     > > mips64             1 999 025 951       -----     +3.404%
>     > > mips64el           1 996 433 190       -----     +3.409%
>     > > ppc                2 819 299 843       -----     +5.436%
>     > > ppc64              2 768 177 037       -----     +5.512%
>     > > ppc64le            2 724 766 044       -----     +5.602%
>     > > riscv64            1 638 324 190       -----     +4.021%
>     > > s390x              2 519 117 806       -----     +3.364%
>     > > sh4                2 595 696 102       -----       +3.0%
>     > > sparc64            3 988 892 763       -----     +2.744%
>     > > x86_64             2 033 624 062       -----     +3.242%
>     > > --------------------------------------------------------
> 
>     Hi Mr. Philippe,
>     The performance degradation relative to v5.1.0 on all targets, and
>     especially m68k, was introduced between the two nightly tests below:
> 
>     [REPORT] Nightly Performance Tests - Thursday, August 20, 2020:
>     https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg04923.html
> 
>     [REPORT] Nightly Performance Tests - Saturday, August 22, 2020
>     https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg05537.html
> 
>     It looks like the new build system is the culprit.
> 
>     The "bisect.py" script introduced during the "TCG Continuous
>     Benchmarking" GSoC project can be very handy in these cases. I wrote
>     about the tool and how to use it in the report below:
>     https://ahmedkrmn.github.io/TCG-Continuous-Benchmarking/Finding-Commits-Affecting-QEMU-Performance/
> 
> 
> Hi, Ahmed.
> 
> I think the bisect.py script will work only if both the "start" and
> "end" commits are before the build system change, or if both of them
> are after it.
> 
> In other words, the script is unlikely to work if "start" is before and
> "end" is after the build system change.

Good point.

> This means that, most probably, one should resort to manual analysis of
> the origins of the performance degradation on Aug 22nd.

What would be useful is a report from the build system change
(commit 7fd51e68c34); then, as Aleksandar suggested, normal
bisection can resume (range 7fd51e68c34..66e01f1cdc9).
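
To illustrate what bisecting over such a range amounts to, here is a
minimal sketch. It is not the bisect.py script from the GSoC project;
the function name, the measure() callback, the threshold, and the toy
commit names and counts are all made up for illustration. It assumes the
first commit in the list is known good, the last is known bad, and the
regression persists once introduced.

```python
from typing import Callable, Sequence


def first_regressing_commit(
    commits: Sequence[str],
    measure: Callable[[str], int],
    threshold: int,
) -> str:
    """Return the first commit whose measurement exceeds `threshold`.

    Assumes `commits` is ordered oldest to newest, commits[0] is known
    good (at or below the threshold) and commits[-1] is known bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if measure(commits[mid]) > threshold:
            bad = mid   # regression is at mid or earlier
        else:
            good = mid  # regression is after mid
    return commits[bad]


# Hypothetical instruction counts per commit (not real QEMU data).
toy_counts = {"c0": 100, "c1": 101, "c2": 100, "c3": 118, "c4": 119, "c5": 118}
toy_commits = list(toy_counts)

culprit = first_regressing_commit(toy_commits, toy_counts.__getitem__, threshold=105)
print(culprit)
```

In practice measure() would build QEMU at the given commit and count
executed instructions, which is exactly the step that may break when
the range straddles the build system change.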

> 
> One area that definitely might be the culprit is the difference in
> CFLAGS before and after.
> 
> Yours,
> Aleksandar
>  
> 
>     Best regards,
>     Ahmed Karaman
>
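
For the CFLAGS comparison Aleksandar mentions, one rough way to spot
the differences is to split both flag strings and compare them. This is
only a sketch: diff_cflags is a hypothetical helper, and the flag
strings below are invented examples, not the actual QEMU flags before
or after the build system change.

```python
import shlex
from typing import List, Tuple


def diff_cflags(before: str, after: str) -> Tuple[List[str], List[str]]:
    """Return (removed, added) flags, each in first-seen order.

    Flags are compared as whole shell tokens, so an option that takes a
    separate argument (e.g. "-I dir") shows up as two tokens.
    """
    before_flags = shlex.split(before)
    after_flags = shlex.split(after)
    before_set, after_set = set(before_flags), set(after_flags)
    # dict.fromkeys() drops duplicates while keeping order.
    removed = [f for f in dict.fromkeys(before_flags) if f not in after_set]
    added = [f for f in dict.fromkeys(after_flags) if f not in before_set]
    return removed, added


# Invented example values, for illustration only.
old_flags = "-O2 -g -fPIE -fno-strict-aliasing"
new_flags = "-O2 -g -fPIE -fstack-protector-strong"

removed, added = diff_cflags(old_flags, new_flags)
print("removed:", removed)
print("added:  ", added)
```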
