Re: powernv gitlab ci regression


From: Daniel Henrique Barboza
Subject: Re: powernv gitlab ci regression
Date: Mon, 20 Dec 2021 23:37:44 -0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.3.0

Hey,

On 12/20/21 18:35, Richard Henderson wrote:
Hi guys,

Somewhere within

Merge tag 'pull-ppc-20211217' of https://github.com/legoater/qemu into staging
ppc 7.0 queue:

* General cleanup for Mac machines (Peter)
* Fixes for FPU exceptions (Lucas)
* Support for new ISA31 instructions (Matheus)
* Fixes for ivshmem (Daniel)
* Cleanups for PowerNV PHB (Christophe and Cedric)
* Updates of PowerNV and pSeries documentation (Leonardo and Daniel)
* Fixes for PowerNV (Daniel)
* Large cleanup of FPU implementation (Richard)
* Removal of SoftTLBs support for PPC74x CPUs (Fabiano)
* Fixes for exception models in MPCx and 60x CPUs (Fabiano)
* Removal of 401/403 CPUs (Cedric)
* Deprecation of taihu machine (Thomas)
* Large rework of PPC405 machine (Cedric)
* Fixes for VSX instructions (Victor and Matheus)
* Fix for e6500 CPU (Fabiano)
* Initial support for PMU (Daniel)

is something that has caused a timeout regression in avocado-system-centos:

 (047/171) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_ppc_powernv8: INTERRUPTED: Test interrupted by SIGTERM\nRunner error occurred: Timeout reached\nOriginal status: ERROR\n{'name': '047-tests/avocado/boot_linux_console.py:BootLinuxConsole.test_ppc_powernv8', 'logdir': '/builds/qemu-project/qemu/build/tests/results/job-2021-12-17T19.23-... (90.46 s)
 (048/171) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_ppc_powernv9: INTERRUPTED: Test interrupted by SIGTERM\nRunner error occurred: Timeout reached\nOriginal status: ERROR\n{'name': '048-tests/avocado/boot_linux_console.py:BootLinuxConsole.test_ppc_powernv9', 'logdir': '/builds/qemu-project/qemu/build/tests/results/job-2021-12-17T19.23-... (90.55 s)
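
For anyone comparing runs, the result lines above can be picked apart with a small script. This is only a rough sketch: the regular expression is an assumption based on the two lines quoted here (once their wrapped fragments are joined), not on any documented avocado output format, and `parse_result` and `slow_tests` are hypothetical helper names.

```python
import re

# Assumed shape of a joined result line, inferred from the two lines quoted
# in this thread:
#   (NNN/TTT) <test id>: <STATUS>: <details> (<seconds> s)
RESULT_RE = re.compile(
    r"\((?P<index>\d+)/(?P<total>\d+)\)\s+"
    r"(?P<test>\S+):\s+"
    r"(?P<status>[A-Z]+):.*"
    r"\((?P<seconds>\d+(?:\.\d+)?)\s*s\)\s*$",
    re.DOTALL,
)

# The first result line from the job log, copied as-is (the logdir path is
# truncated in the log itself and is left that way).
SAMPLE = (
    r" (047/171) tests/avocado/boot_linux_console.py:"
    r"BootLinuxConsole.test_ppc_powernv8:  "
    r"INTERRUPTED: Test interrupted by SIGTERM\nRunner error occurred: "
    r"Timeout reached\nOriginal status: ERROR\n{'name': "
    r"'047-tests/avocado/boot_linux_console.py:"
    r"BootLinuxConsole.test_ppc_powernv8', 'logdir': "
    r"'/builds/qemu-project/qemu/build/tests/results/"
    r"job-2021-12-17T19.23-... (90.46 s)"
)


def parse_result(line):
    """Return index, test id, status and duration, or None if no match."""
    match = RESULT_RE.search(line)
    if match is None:
        return None
    return {
        "index": int(match.group("index")),
        "test": match.group("test"),
        "status": match.group("status"),
        "seconds": float(match.group("seconds")),
    }


def slow_tests(lines, threshold):
    """Yield parsed results whose duration is at least `threshold` seconds."""
    for line in lines:
        result = parse_result(line)
        if result is not None and result["seconds"] >= threshold:
            yield result


if __name__ == "__main__":
    print(parse_result(SAMPLE))
```

Feeding the joined lines of two job logs through `slow_tests` gives a quick way to see which tests sit near the timeout before and after a given merge.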

See e.g. https://gitlab.com/qemu-project/qemu/-/jobs/1898304074

Thanks for letting us know. I bisected it and the culprit is this patch:


commit 4db3907a40a087e2cc1839d19a3642539d36610b
Author: Daniel Henrique Barboza <danielhb413@gmail.com>
Date:   Fri Dec 17 17:57:18 2021 +0100

    target/ppc: enable PMU instruction count


This is the patch where I added instruction counting to the ppc64 PMU.
After this patch the performance of these 2 tests is degraded to the
point where we're hitting timeouts in gitlab (I didn't hit timeouts on
my machine, but the performance is noticeably worse).

I'll need to see the serial console of the VM booting up to evaluate
whether some kernel module is using the PMU during boot and causing the
delay. I'll also look into improving the performance (e.g. using more
TCG code and avoiding helper calls).

It might be the case that the performance gain is enough to make these
tests happy again, although my initial guess is that something during
boot is starting the PMU and leaving it running.


Thanks,


Daniel










Timeouts are especially tedious with gitlab, because they're not usually 
consistent, and often go away with a retry.  If I don't see the same failure on 
my local machine, I often let it go.

But in this case, the gitlab ci regression has been consistent, not passing a 
single time since.  Which makes me think this is not just a ci artifact, but 
that there's a real slowdown.  Could you please have a look?


r~


