[Bug 1921664] Re: Recent update broke qemu-system-riscv64
From: Christian Ehrhardt
Subject: [Bug 1921664] Re: Recent update broke qemu-system-riscv64
Date: Wed, 14 Apr 2021 06:24:54 -0000
I've also rebuilt the most recent master, c1e90def01, which is about 55 commits
newer than 6.0-rc2.
As in Tommy's experiments, I was unable to reproduce the crash there.
But given the data from the earlier tests, this is more likely an accident of
slightly different timing than an actual fix (to be clear, I'd appreciate it if
there were a fix; I just cannot conclude from this build being good that I
could, e.g., bisect).
export CFLAGS="-O0 -g -fPIC"
../configure --enable-system --disable-xen --disable-werror --disable-docs \
  --disable-libudev --disable-guest-agent --disable-sdl --disable-gtk \
  --disable-vnc --disable-xen --disable-brlapi --disable-hax --disable-vde \
  --disable-netmap --disable-rbd --disable-libiscsi --disable-libnfs \
  --disable-smartcard --disable-libusb --disable-usb-redir --disable-seccomp \
  --disable-glusterfs --disable-tpm --disable-numa --disable-opengl \
  --disable-virglrenderer --disable-xfsctl --disable-slirp --disable-blobs \
  --disable-rdma --disable-pvrdma --disable-attr --disable-vhost-net \
  --disable-vhost-vsock --disable-vhost-scsi --disable-vhost-crypto \
  --disable-vhost-user --disable-spice --disable-qom-cast-debug --disable-bochs \
  --disable-cloop --disable-dmg --disable-qcow1 --disable-vdi --disable-vvfat \
  --disable-qed --disable-parallels --disable-sheepdog --disable-avx2 \
  --disable-nettle --disable-gnutls --disable-capstone --enable-tools \
  --disable-libssh --disable-libpmem --disable-cap-ng --disable-vte \
  --disable-iconv --disable-curses --disable-linux-aio --disable-linux-io-uring \
  --disable-kvm --disable-replication --audio-drv-list="" --disable-vhost-kernel \
  --disable-vhost-vdpa --disable-live-block-migration --disable-keyring \
  --disable-auth-pam --disable-curl --disable-strip --enable-fdt \
  --target-list="riscv64-softmmu"
make -j10
Just like the package build, this configures as:
coroutine backend: ucontext
coroutine pool: YES
5/5 runs with that build were OK.
But since we know the failure is racy, I'm unsure whether that implies much :-/
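To put a number on how little a handful of clean runs says about a racy failure, something like the following could be used. This is only a sketch: count_failures and p_all_pass are hypothetical helper names, not part of QEMU or the test image, and the probability estimate assumes that runs fail independently of each other.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and report how many runs failed.
# Usage: count_failures N command [args...]
count_failures() {
    runs=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Any non-zero exit status counts as a failure; a process killed by
        # SIGABRT after a failed assertion is reported as non-zero by the shell.
        if ! "$@" >/dev/null 2>&1; then
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    echo "$failed/$runs runs failed"
}

# Chance of seeing N clean runs in a row if each run fails with probability P.
# Assumes independent runs, which may not hold for a timing-dependent bug.
# Usage: p_all_pass P N
p_all_pass() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.3f\n", (1 - p) ^ n }'
}
```

The command passed to count_failures would have to be a wrapper (not shown) that boots the guest and exits non-zero only when the assertion fires. p_all_pass 0.2 5 shows that even a build failing in one of five boots would still pass five runs in a row about a third of the time.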
P.S. I have not yet gone into a build-option bisect, but chances are the build
options could be related. That is too much stabbing in the dark, though; maybe
someone experienced in the coroutine code can already make sense of all the
info we have gathered so far.
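If someone does want to look at build options, a simpler alternative to a true bisect is to drop one configure option at a time and rebuild. The sketch below only generates the candidate option sets; variants_without_one is a hypothetical name, and building and testing each variant is left out.

```shell
#!/bin/sh
# Hypothetical helper: print one candidate option set per line, each omitting
# exactly one of the given options. An option listed twice (such as
# --disable-xen in the configure line above) is dropped in both places.
# Usage: variants_without_one OPTION...
variants_without_one() {
    for skip in "$@"; do
        line=""
        for flag in "$@"; do
            if [ "$flag" = "$skip" ]; then
                continue
            fi
            # Append with a separating space, except before the first option.
            line="${line:+$line }$flag"
        done
        printf '%s\n' "$line"
    done
}
```

Each printed line could then be passed to ../configure in a fresh build directory. Given the racy behaviour, every variant would need many runs before being called good or bad.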
I'll update the bug description and add an upstream task so that all the info
we have gets mirrored to the qemu mailing lists.
** Summary changed:
- Recent update broke qemu-system-riscv64
+ Coroutines are racy for risc64 emu on arm64 - crash on Assertion
** Description changed:
+ Note: "riscv64 on arm64" may only trigger this because it is slow-on-slow
+ (slow@slow) emulation, so other architectures could be affected as well.
+
+ The following case triggers on a Raspberry Pi 4 running arm64
+ Ubuntu 21.04 [1][2]. It might trigger in other environments as well,
+ but that is where we have seen it so far.
+
+ $ wget https://github.com/carlosedp/riscv-bringup/releases/download/v1.0/UbuntuFocal-riscv64-QemuVM.tar.gz
+ $ tar xzf UbuntuFocal-riscv64-QemuVM.tar.gz
+ $ ./run_riscvVM.sh
+ (wait ~2 minutes)
+ [ OK ] Reached target Local File Systems (Pre).
+ [ OK ] Reached target Local File Systems.
+ Starting udev Kernel Device Manager...
+ qemu-system-riscv64: ../../util/qemu-coroutine-lock.c:57: qemu_co_queue_wait_impl: Assertion `qemu_in_coroutine()' failed.
+
+ This is often, but not 100%, reproducible, and the failures differ slightly; we
+ see either of:
+ - qemu-system-riscv64: ../../util/qemu-coroutine-lock.c:57: qemu_co_queue_wait_impl: Assertion `qemu_in_coroutine()' failed.
+ - qemu-system-riscv64: ../../block/aio_task.c:64: aio_task_pool_wait_one: Assertion `qemu_coroutine_self() == pool->main_co' failed.
+
+ Rebuilding working cases has been shown to make them fail, and rebuilding
+ (or even reinstalling) bad cases has made them work. Also, the same builds
+ behave differently on different arm64 CPUs. TL;DR: The full list of conditions
+ influencing the good/bad cases is not yet known.
+
+ [1]: https://ubuntu.com/tutorials/how-to-install-ubuntu-on-your-raspberry-pi#1-overview
+ [2]: http://cdimage.ubuntu.com/daily-preinstalled/pending/hirsute-preinstalled-desktop-arm64+raspi.img.xz
+
+
+ --- --- original report --- ---
+
I regularly run a RISC-V (RV64GC) QEMU VM, but an update a few days ago
broke it. Now when I launch it, it hits an assertion:
-
- OpenSBI v0.6
-    ____                    _____ ____ _____
-   / __ \                  / ____|  _ \_   _|
-  | |  | |_ __   ___ _ __ | (___ | |_) || |
-  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
-  | |__| | |_) |  __/ | | |____) | |_) || |_
-   \____/| .__/ \___|_| |_|_____/|____/_____|
-         | |
-         |_|
-
+ OpenSBI v0.6
+    ____                    _____ ____ _____
+   / __ \                  / ____|  _ \_   _|
+  | |  | |_ __   ___ _ __ | (___ | |_) || |
+  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
+  | |__| | |_) |  __/ | | |____) | |_) || |_
+   \____/| .__/ \___|_| |_|_____/|____/_____|
+         | |
+         |_|
+
...
- Found /boot/extlinux/extlinux.conf
- Retrieving file: /boot/extlinux/extlinux.conf
- 618 bytes read in 2 ms (301.8 KiB/s)
- RISC-V Qemu Boot Options
- 1: Linux kernel-5.5.0-dirty
- 2: Linux kernel-5.5.0-dirty (recovery mode)
- Enter choice: 1: Linux kernel-5.5.0-dirty
- Retrieving file: /boot/initrd.img-5.5.0-dirty
- qemu-system-riscv64: ../../block/aio_task.c:64: aio_task_pool_wait_one:
Assertion `qemu_coroutine_self() == pool->main_co' failed.
+ Found /boot/extlinux/extlinux.conf
+ Retrieving file: /boot/extlinux/extlinux.conf
+ 618 bytes read in 2 ms (301.8 KiB/s)
+ RISC-V Qemu Boot Options
+ 1: Linux kernel-5.5.0-dirty
+ 2: Linux kernel-5.5.0-dirty (recovery mode)
+ Enter choice: 1: Linux kernel-5.5.0-dirty
+ Retrieving file: /boot/initrd.img-5.5.0-dirty
+ qemu-system-riscv64: ../../block/aio_task.c:64: aio_task_pool_wait_one:
Assertion `qemu_coroutine_self() == pool->main_co' failed.
./run.sh: line 31: 1604 Aborted (core dumped)
qemu-system-riscv64 -machine virt -nographic -smp 8 -m 8G -bios fw_payload.bin
-device virtio-blk-devi
ce,drive=hd0 -object rng-random,filename=/dev/urandom,id=rng0 -device
virtio-rng-device,rng=rng0 -drive
file=riscv64-UbuntuFocal-qemu.qcow2,format=qcow2,id=hd0 -devi
- ce virtio-net-device,netdev=usernet -netdev user,id=usernet,$ports
+ ce virtio-net-device,netdev=usernet -netdev user,id=usernet,$ports
Interestingly this doesn't happen on the AMD64 version of Ubuntu 21.04
(fully updated).
-
Think you have everything already, but just in case:
$ lsb_release -rd
Description: Ubuntu Hirsute Hippo (development branch)
Release: 21.04
$ uname -a
Linux minimacvm 5.11.0-11-generic #12-Ubuntu SMP Mon Mar 1 19:27:36 UTC 2021
aarch64 aarch64 aarch64 GNU/Linux
(note this is a VM running on macOS/M1)
$ apt-cache policy qemu
qemu:
- Installed: 1:5.2+dfsg-9ubuntu1
- Candidate: 1:5.2+dfsg-9ubuntu1
- Version table:
- *** 1:5.2+dfsg-9ubuntu1 500
- 500 http://ports.ubuntu.com/ubuntu-ports hirsute/universe arm64
Packages
- 100 /var/lib/dpkg/status
+ Installed: 1:5.2+dfsg-9ubuntu1
+ Candidate: 1:5.2+dfsg-9ubuntu1
+ Version table:
+ *** 1:5.2+dfsg-9ubuntu1 500
+ 500 http://ports.ubuntu.com/ubuntu-ports hirsute/universe arm64
Packages
+ 100 /var/lib/dpkg/status
ProblemType: Bug
DistroRelease: Ubuntu 21.04
Package: qemu 1:5.2+dfsg-9ubuntu1
ProcVersionSignature: Ubuntu 5.11.0-11.12-generic 5.11.0
Uname: Linux 5.11.0-11-generic aarch64
ApportVersion: 2.20.11-0ubuntu61
Architecture: arm64
CasperMD5CheckResult: unknown
CurrentDmesg:
- Error: command ['pkexec', 'dmesg'] failed with exit code 127:
polkit-agent-helper-1: error response to PolicyKit daemon:
GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: No session for cookie
- Error executing command as another user: Not authorized
-
- This incident has been reported.
+ Error: command ['pkexec', 'dmesg'] failed with exit code 127:
polkit-agent-helper-1: error response to PolicyKit daemon:
GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: No session for cookie
+ Error executing command as another user: Not authorized
+
+ This incident has been reported.
Date: Mon Mar 29 02:33:25 2021
Dependencies:
-
+
KvmCmdLine: COMMAND STAT EUID RUID PID PPID %CPU COMMAND
Lspci-vt:
- -[0000:00]-+-00.0 Apple Inc. Device f020
- +-01.0 Red Hat, Inc. Virtio network device
- +-05.0 Red Hat, Inc. Virtio console
- +-06.0 Red Hat, Inc. Virtio block device
- \-07.0 Red Hat, Inc. Virtio RNG
+ -[0000:00]-+-00.0 Apple Inc. Device f020
+ +-01.0 Red Hat, Inc. Virtio network device
+ +-05.0 Red Hat, Inc. Virtio console
+ +-06.0 Red Hat, Inc. Virtio block device
+ \-07.0 Red Hat, Inc. Virtio RNG
Lsusb: Error: command ['lsusb'] failed with exit code 1:
Lsusb-t:
-
+
Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
ProcEnviron:
- TERM=screen
- PATH=(custom, no user)
- XDG_RUNTIME_DIR=<set>
- LANG=C.UTF-8
- SHELL=/bin/bash
+ TERM=screen
+ PATH=(custom, no user)
+ XDG_RUNTIME_DIR=<set>
+ LANG=C.UTF-8
+ SHELL=/bin/bash
ProcKernelCmdLine: console=hvc0 root=/dev/vda
SourcePackage: qemu
UpgradeStatus: Upgraded to hirsute on 2020-12-30 (88 days ago)
acpidump:
- Error: command ['pkexec', '/usr/share/apport/dump_acpi_tables.py'] failed
with exit code 127: polkit-agent-helper-1: error response to PolicyKit daemon:
GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: No session for cookie
- Error executing command as another user: Not authorized
-
- This incident has been reported.
+ Error: command ['pkexec', '/usr/share/apport/dump_acpi_tables.py'] failed
with exit code 127: polkit-agent-helper-1: error response to PolicyKit daemon:
GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: No session for cookie
+ Error executing command as another user: Not authorized
+
+ This incident has been reported.
** Also affects: qemu
Importance: Undecided
Status: New
** Changed in: qemu (Ubuntu)
Importance: Undecided => Low
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1921664
Title:
Coroutines are racy for risc64 emu on arm64 - crash on Assertion
Status in QEMU:
New
Status in qemu package in Ubuntu:
Confirmed
Bug description:
Note: "riscv64 on arm64" may only trigger this because it is slow-on-slow
(slow@slow) emulation, so other architectures could be affected as well.
The following case triggers on a Raspberry Pi 4 running arm64
Ubuntu 21.04 [1][2]. It might trigger in other environments as well,
but that is where we have seen it so far.
$ wget https://github.com/carlosedp/riscv-bringup/releases/download/v1.0/UbuntuFocal-riscv64-QemuVM.tar.gz
$ tar xzf UbuntuFocal-riscv64-QemuVM.tar.gz
$ ./run_riscvVM.sh
(wait ~2 minutes)
[ OK ] Reached target Local File Systems (Pre).
[ OK ] Reached target Local File Systems.
Starting udev Kernel Device Manager...
qemu-system-riscv64: ../../util/qemu-coroutine-lock.c:57: qemu_co_queue_wait_impl: Assertion `qemu_in_coroutine()' failed.
This is often, but not 100%, reproducible, and the failures differ slightly; we
see either of:
- qemu-system-riscv64: ../../util/qemu-coroutine-lock.c:57: qemu_co_queue_wait_impl: Assertion `qemu_in_coroutine()' failed.
- qemu-system-riscv64: ../../block/aio_task.c:64: aio_task_pool_wait_one: Assertion `qemu_coroutine_self() == pool->main_co' failed.
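When collecting many runs, the two failure signatures above can be told apart automatically from a captured console log. The following is a minimal sketch: classify_crash and its output labels are hypothetical, and it assumes each run's QEMU output was saved to a file.

```shell
#!/bin/sh
# Hypothetical helper: report which of the two known assertions, if any,
# appears in a saved log file.
# Usage: classify_crash LOGFILE
classify_crash() {
    log=$1
    # -F matches the text literally, -q suppresses output and only sets the
    # exit status.
    if grep -qF 'qemu_co_queue_wait_impl: Assertion' "$log"; then
        echo "co_queue_wait"
    elif grep -qF 'aio_task_pool_wait_one: Assertion' "$log"; then
        echo "aio_task_pool"
    else
        echo "none"
    fi
}
```

Tallying these labels per build or per host might show whether one signature dominates in a given environment.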
Rebuilding working cases has been shown to make them fail, and rebuilding
(or even reinstalling) bad cases has made them work. Also, the same builds
behave differently on different arm64 CPUs. TL;DR: The full list of conditions
influencing the good/bad cases is not yet known.
[1]: https://ubuntu.com/tutorials/how-to-install-ubuntu-on-your-raspberry-pi#1-overview
[2]: http://cdimage.ubuntu.com/daily-preinstalled/pending/hirsute-preinstalled-desktop-arm64+raspi.img.xz
--- --- original report --- ---
I regularly run a RISC-V (RV64GC) QEMU VM, but an update a few days
ago broke it. Now when I launch it, it hits an assertion:
OpenSBI v0.6
   ____                    _____ ____ _____
  / __ \                  / ____|  _ \_   _|
 | |  | |_ __   ___ _ __ | (___ | |_) || |
 | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
 | |__| | |_) |  __/ | | |____) | |_) || |_
  \____/| .__/ \___|_| |_|_____/|____/_____|
        | |
        |_|
...
Found /boot/extlinux/extlinux.conf
Retrieving file: /boot/extlinux/extlinux.conf
618 bytes read in 2 ms (301.8 KiB/s)
RISC-V Qemu Boot Options
1: Linux kernel-5.5.0-dirty
2: Linux kernel-5.5.0-dirty (recovery mode)
Enter choice: 1: Linux kernel-5.5.0-dirty
Retrieving file: /boot/initrd.img-5.5.0-dirty
qemu-system-riscv64: ../../block/aio_task.c:64: aio_task_pool_wait_one: Assertion `qemu_coroutine_self() == pool->main_co' failed.
./run.sh: line 31: 1604 Aborted (core dumped) qemu-system-riscv64
-machine virt -nographic -smp 8 -m 8G -bios fw_payload.bin
-device virtio-blk-device,drive=hd0
-object rng-random,filename=/dev/urandom,id=rng0
-device virtio-rng-device,rng=rng0
-drive file=riscv64-UbuntuFocal-qemu.qcow2,format=qcow2,id=hd0
-device virtio-net-device,netdev=usernet -netdev user,id=usernet,$ports
Interestingly this doesn't happen on the AMD64 version of Ubuntu 21.04
(fully updated).
Think you have everything already, but just in case:
$ lsb_release -rd
Description: Ubuntu Hirsute Hippo (development branch)
Release: 21.04
$ uname -a
Linux minimacvm 5.11.0-11-generic #12-Ubuntu SMP Mon Mar 1 19:27:36 UTC 2021
aarch64 aarch64 aarch64 GNU/Linux
(note this is a VM running on macOS/M1)
$ apt-cache policy qemu
qemu:
Installed: 1:5.2+dfsg-9ubuntu1
Candidate: 1:5.2+dfsg-9ubuntu1
Version table:
*** 1:5.2+dfsg-9ubuntu1 500
500 http://ports.ubuntu.com/ubuntu-ports hirsute/universe arm64
Packages
100 /var/lib/dpkg/status
ProblemType: Bug
DistroRelease: Ubuntu 21.04
Package: qemu 1:5.2+dfsg-9ubuntu1
ProcVersionSignature: Ubuntu 5.11.0-11.12-generic 5.11.0
Uname: Linux 5.11.0-11-generic aarch64
ApportVersion: 2.20.11-0ubuntu61
Architecture: arm64
CasperMD5CheckResult: unknown
CurrentDmesg:
Error: command ['pkexec', 'dmesg'] failed with exit code 127:
polkit-agent-helper-1: error response to PolicyKit daemon:
GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: No session for cookie
Error executing command as another user: Not authorized
This incident has been reported.
Date: Mon Mar 29 02:33:25 2021
Dependencies:
KvmCmdLine: COMMAND STAT EUID RUID PID PPID %CPU COMMAND
Lspci-vt:
-[0000:00]-+-00.0 Apple Inc. Device f020
+-01.0 Red Hat, Inc. Virtio network device
+-05.0 Red Hat, Inc. Virtio console
+-06.0 Red Hat, Inc. Virtio block device
\-07.0 Red Hat, Inc. Virtio RNG
Lsusb: Error: command ['lsusb'] failed with exit code 1:
Lsusb-t:
Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
ProcEnviron:
TERM=screen
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=C.UTF-8
SHELL=/bin/bash
ProcKernelCmdLine: console=hvc0 root=/dev/vda
SourcePackage: qemu
UpgradeStatus: Upgraded to hirsute on 2020-12-30 (88 days ago)
acpidump:
Error: command ['pkexec', '/usr/share/apport/dump_acpi_tables.py'] failed
with exit code 127: polkit-agent-helper-1: error response to PolicyKit daemon:
GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: No session for cookie
Error executing command as another user: Not authorized
This incident has been reported.
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1921664/+subscriptions