From: Thomas Huth
Subject: Re: [PATCH 6/6] gitlab-ci.d/buildtest: Disintegrate the build-coroutine-sigaltstack job
Date: Mon, 6 Feb 2023 08:44:29 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.13.0

On 03/02/2023 22.14, Juan Quintela wrote:
> Peter Maydell <peter.maydell@linaro.org> wrote:
>> On Fri, 3 Feb 2023 at 15:44, Thomas Huth <thuth@redhat.com> wrote:
>>> On 03/02/2023 13.08, Kevin Wolf wrote:
>>>> On 03.02.2023 at 12:23, Thomas Huth wrote:
>>>>> On 30/01/2023 11.58, Daniel P. Berrangé wrote:
>>>>>> On Mon, Jan 30, 2023 at 11:44:46AM +0100, Thomas Huth wrote:
>>>>>>> We can get rid of the build-coroutine-sigaltstack job by moving
>>>>>>> the configure flags that should be tested here to other jobs:
>>>>>>> Move --with-coroutine=sigaltstack to the build-without-defaults
>>>>>>> job and --enable-trace-backends=ftrace to the
>>>>>>> cross-s390x-kvm-only job.
>>>>>>
>>>>>> The biggest user of coroutines is the block layer. So we probably
>>>>>> ought to have coroutines aligned with a job that triggers the
>>>>>> 'make check-block' for iotests. IIUC, the without-defaults job
>>>>>> won't do that.
>>>>>>
>>>>>> How about, arbitrarily, using either the 'check-system-debian' or
>>>>>> 'check-system-ubuntu' job. Those distros are closely related, so
>>>>>> getting sigaltstack vs ucontext coverage between them is a good
>>>>>> win, and they both trigger the block jobs IIUC.
>>>>>
>>>>> I gave it a try with the ubuntu job, but this apparently trips up
>>>>> the iotests:
>>>>>
>>>>> https://gitlab.com/thuth/qemu/-/jobs/3705965062#L212
>>>>>
>>>>> Does anybody have a clue what could be going wrong here?
>>>>
>>>> I'm not sure how changing the coroutine backend could cause it, but
>>>> primarily this looks like an assertion failure in migration code.
>>>>
>>>> Dave, Juan, any ideas what this assertion checks and why it could be
>>>> failing?
>>>
>>> Ah, I think it's the bug that will be fixed by:
>>>
>>> https://lore.kernel.org/qemu-devel/20230202160640.2300-2-quintela@redhat.com/
>>>
>>> The fix hasn't hit the master branch yet (I think), and I had another
>>> patch in my CI that disables the aarch64 binary in that runner, so the
>>> iotests suddenly have been executed with the alpha binary there
>>> --> migration fails.
>>>
>>> So never mind, it will be fixed as soon as Juan's pull request gets
>>> included.
>>
>> The migration tests have been flaky for a while now, including setups
>> where host and guest page sizes are the same. (For instance, my x86
>> macos box pretty reliably sees failures when the machine is under
>> load.)
>
> I *thought* that we had fixed all of those. But it is difficult for me
> to know because:
>
> - It only happens when one runs "make check"
> - Running ./migration-test has never failed for me
> - When it fails (and it has been a while since it has failed for me) it
>   is impossible for me to detect what is going on, and as said, I have
>   never been able to reproduce it running only migration-test.
>
> I will try to run several at the same time and see if it happens.
>
> And as Thomas said, I *think* that the fix that Peter Xu posted should
> fix this issue. Famous last words.

The patch from Peter should fix the problems that I triggered via the iotests, but I think the migration qtest is still unstable independently of that issue. See for example the latest staging pipeline:

https://gitlab.com/qemu-project/qemu/-/pipelines/767961842

The migration qtest failed in both the x86-freebsd-build and the ubuntu-20.04-s390x-all pipelines.

Thomas