qemu-devel

[Bug 1921664] Re: Recent update broke qemu-system-riscv64


From: Christian Ehrhardt 
Subject: [Bug 1921664] Re: Recent update broke qemu-system-riscv64
Date: Wed, 14 Apr 2021 06:24:54 -0000

I've also rebuilt the most recent master (c1e90def01), about 55 commits newer
than 6.0-rc2.
As in Tommy's experiments, I was unable to reproduce the issue there.
But given the data from the earlier tests, this is more likely an accident
of slightly different timing than an actual fix. (To be clear, I'd appreciate
it if there were a fix; I just can't conclude from this build being good that
I could, e.g., bisect.)

export CFLAGS="-O0 -g -fPIC"
../configure --enable-system --disable-xen --disable-werror --disable-docs \
  --disable-libudev --disable-guest-agent --disable-sdl --disable-gtk \
  --disable-vnc --disable-xen --disable-brlapi --disable-hax --disable-vde \
  --disable-netmap --disable-rbd --disable-libiscsi --disable-libnfs \
  --disable-smartcard --disable-libusb --disable-usb-redir --disable-seccomp \
  --disable-glusterfs --disable-tpm --disable-numa --disable-opengl \
  --disable-virglrenderer --disable-xfsctl --disable-slirp --disable-blobs \
  --disable-rdma --disable-pvrdma --disable-attr --disable-vhost-net \
  --disable-vhost-vsock --disable-vhost-scsi --disable-vhost-crypto \
  --disable-vhost-user --disable-spice --disable-qom-cast-debug --disable-bochs \
  --disable-cloop --disable-dmg --disable-qcow1 --disable-vdi --disable-vvfat \
  --disable-qed --disable-parallels --disable-sheepdog --disable-avx2 \
  --disable-nettle --disable-gnutls --disable-capstone --enable-tools \
  --disable-libssh --disable-libpmem --disable-cap-ng --disable-vte \
  --disable-iconv --disable-curses --disable-linux-aio --disable-linux-io-uring \
  --disable-kvm --disable-replication --audio-drv-list="" --disable-vhost-kernel \
  --disable-vhost-vdpa --disable-live-block-migration --disable-keyring \
  --disable-auth-pam --disable-curl --disable-strip --enable-fdt \
  --target-list="riscv64-softmmu"
make -j10

Just like the package build, this configures as:
   coroutine backend: ucontext
   coroutine pool: YES

5/5 runs with this build were OK.
But since we know the issue is racy, I'm unsure whether that implies much :-/
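
To put a rough number on that doubt: if a racy failure hit each run
independently with probability p, the chance of five clean runs in a row
would be (1 - p)^5. A minimal sketch follows; the function name and the
per-run failure rates are purely hypothetical values for illustration, since
no failure rate was measured here.

```shell
#!/bin/sh
# Probability that N independent runs all pass, given a per-run
# failure probability p. Both inputs are illustrative assumptions.
all_pass_probability() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.3f\n", (1 - p) ^ n }'
}

# Hypothetical failure rates; none of these were measured.
for p in 0.1 0.3 0.5; do
    echo "p=$p -> $(all_pass_probability "$p" 5)"
done
```

Under these assumptions, even a bug that hit 30% of runs would still pass
five runs in a row about 17% of the time, so 5/5 is weak evidence of a fix.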

P.S. I have not yet gone into a build-option bisect, but chances are the
build options are related. That is too much stabbing in the dark, though;
maybe someone experienced in the coroutine code can already make sense of
all the info we have gathered so far.
I'll update the bug description and add an upstream task so that all the info
we have gets mirrored to the qemu mailing lists.
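
Any bisect, over build options or over commits, would have to repeat the test
many times per step because of the raciness. A minimal sketch of such a
repeat-and-count helper follows. The function name `count_failures` and the
idea of bounding each boot with GNU `timeout` (which exits with status 124
when the time limit expires) are assumptions for illustration, not something
used in the tests above.

```shell
#!/bin/sh
# Run a command N times and report how many runs failed.
# Exit status 124 is what GNU timeout returns when the time limit
# expires, so it is counted as "survived" rather than as a crash.
# usage: count_failures N cmd [args...]
count_failures() {
    runs=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        status=0
        # Discard output; only the exit status matters here.
        "$@" >/dev/null 2>&1 || status=$?
        if [ "$status" -ne 0 ] && [ "$status" -ne 124 ]; then
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    echo "$failed/$runs runs failed"
}
```

For example, `count_failures 20 timeout 300 ./run_riscvVM.sh` would boot the
VM from the bug description 20 times with a 5 minute limit each; both numbers
are arbitrary, and a run aborted by the assertion would show up as a nonzero
exit status.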

** Summary changed:

- Recent update broke qemu-system-riscv64
+ Coroutines are racy for risc64 emu on arm64 - crash on Assertion

** Description changed:

+ Note: the "riscv64 on arm64" combination may matter only because it is
+ slow-on-slow emulation; other architectures could be affected as well.
+ 
+ The following case triggers on a Raspberry Pi 4 running arm64
+ Ubuntu 21.04 [1][2]. It might trigger in other environments as well,
+ but that is where we have seen it so far.
+ 
+    $ wget https://github.com/carlosedp/riscv-bringup/releases/download/v1.0/UbuntuFocal-riscv64-QemuVM.tar.gz
+    $ tar xzf UbuntuFocal-riscv64-QemuVM.tar.gz
+    $ ./run_riscvVM.sh
+ (wait ~2 minutes)
+    [ OK ] Reached target Local File Systems (Pre).
+    [ OK ] Reached target Local File Systems.
+             Starting udev Kernel Device Manager...
+ qemu-system-riscv64: ../../util/qemu-coroutine-lock.c:57: qemu_co_queue_wait_impl: Assertion `qemu_in_coroutine()' failed.
+ 
+ This reproduces often, but not 100% of the time, and the failures differ
+ slightly; we see either of:
+ - qemu-system-riscv64: ../../util/qemu-coroutine-lock.c:57: qemu_co_queue_wait_impl: Assertion `qemu_in_coroutine()' failed.
+ - qemu-system-riscv64: ../../block/aio_task.c:64: aio_task_pool_wait_one: Assertion `qemu_coroutine_self() == pool->main_co' failed.
+ 
+ Rebuilding working cases has been shown to make them fail, and rebuilding
+ (or even reinstalling) bad cases has made them work. Also, the same builds
+ behave differently on different arm64 CPUs. TL;DR: The full list of
+ conditions influencing the good/bad outcome is not yet known.
+ 
+ [1]: https://ubuntu.com/tutorials/how-to-install-ubuntu-on-your-raspberry-pi#1-overview
+ [2]: http://cdimage.ubuntu.com/daily-preinstalled/pending/hirsute-preinstalled-desktop-arm64+raspi.img.xz
+ 
+ 
+ --- --- original report --- ---
+ 
  I regularly run a RISC-V (RV64GC) QEMU VM, but an update a few days ago
  broke it.  Now when I launch it, it hits an assertion:
  
- 
- OpenSBI v0.6
-    ____                    _____ ____ _____
-   / __ \                  / ____|  _ \_   _|
-  | |  | |_ __   ___ _ __ | (___ | |_) || |
-  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
-  | |__| | |_) |  __/ | | |____) | |_) || |_
-   \____/| .__/ \___|_| |_|_____/|____/_____|
-         | |
-         |_|
- 
+ OpenSBI v0.6
+    ____                    _____ ____ _____
+   / __ \                  / ____|  _ \_   _|
+  | |  | |_ __   ___ _ __ | (___ | |_) || |
+  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
+  | |__| | |_) |  __/ | | |____) | |_) || |_
+   \____/| .__/ \___|_| |_|_____/|____/_____|
+         | |
+         |_|
+ 
  ...
- Found /boot/extlinux/extlinux.conf
- Retrieving file: /boot/extlinux/extlinux.conf
- 618 bytes read in 2 ms (301.8 KiB/s)
- RISC-V Qemu Boot Options
- 1:      Linux kernel-5.5.0-dirty
- 2:      Linux kernel-5.5.0-dirty (recovery mode)
- Enter choice: 1:        Linux kernel-5.5.0-dirty
- Retrieving file: /boot/initrd.img-5.5.0-dirty
- qemu-system-riscv64: ../../block/aio_task.c:64: aio_task_pool_wait_one: Assertion `qemu_coroutine_self() == pool->main_co' failed.
+ Found /boot/extlinux/extlinux.conf
+ Retrieving file: /boot/extlinux/extlinux.conf
+ 618 bytes read in 2 ms (301.8 KiB/s)
+ RISC-V Qemu Boot Options
+ 1:      Linux kernel-5.5.0-dirty
+ 2:      Linux kernel-5.5.0-dirty (recovery mode)
+ Enter choice: 1:        Linux kernel-5.5.0-dirty
+ Retrieving file: /boot/initrd.img-5.5.0-dirty
+ qemu-system-riscv64: ../../block/aio_task.c:64: aio_task_pool_wait_one: Assertion `qemu_coroutine_self() == pool->main_co' failed.
  ./run.sh: line 31:  1604 Aborted                 (core dumped) 
qemu-system-riscv64 -machine virt -nographic -smp 8 -m 8G -bios fw_payload.bin 
-device virtio-blk-devi
  ce,drive=hd0 -object rng-random,filename=/dev/urandom,id=rng0 -device 
virtio-rng-device,rng=rng0 -drive 
file=riscv64-UbuntuFocal-qemu.qcow2,format=qcow2,id=hd0 -devi
- ce virtio-net-device,netdev=usernet -netdev user,id=usernet,$ports            
    
+ ce virtio-net-device,netdev=usernet -netdev user,id=usernet,$ports
  
  Interestingly this doesn't happen on the AMD64 version of Ubuntu 21.04
  (fully updated).
- 
  
  Think you have everything already, but just in case:
  
  $ lsb_release -rd
  Description:    Ubuntu Hirsute Hippo (development branch)
  Release:        21.04
  
  $ uname -a
  Linux minimacvm 5.11.0-11-generic #12-Ubuntu SMP Mon Mar 1 19:27:36 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux
  (note this is a VM running on macOS/M1)
  
  $ apt-cache policy qemu
  qemu:
-   Installed: 1:5.2+dfsg-9ubuntu1
-   Candidate: 1:5.2+dfsg-9ubuntu1
-   Version table:
-  *** 1:5.2+dfsg-9ubuntu1 500
-         500 http://ports.ubuntu.com/ubuntu-ports hirsute/universe arm64 Packages
-         100 /var/lib/dpkg/status
+   Installed: 1:5.2+dfsg-9ubuntu1
+   Candidate: 1:5.2+dfsg-9ubuntu1
+   Version table:
+  *** 1:5.2+dfsg-9ubuntu1 500
+         500 http://ports.ubuntu.com/ubuntu-ports hirsute/universe arm64 Packages
+         100 /var/lib/dpkg/status
  
  ProblemType: Bug
  DistroRelease: Ubuntu 21.04
  Package: qemu 1:5.2+dfsg-9ubuntu1
  ProcVersionSignature: Ubuntu 5.11.0-11.12-generic 5.11.0
  Uname: Linux 5.11.0-11-generic aarch64
  ApportVersion: 2.20.11-0ubuntu61
  Architecture: arm64
  CasperMD5CheckResult: unknown
  CurrentDmesg:
-  Error: command ['pkexec', 'dmesg'] failed with exit code 127: polkit-agent-helper-1: error response to PolicyKit daemon: GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: No session for cookie
-  Error executing command as another user: Not authorized
-  
-  This incident has been reported.
+  Error: command ['pkexec', 'dmesg'] failed with exit code 127: polkit-agent-helper-1: error response to PolicyKit daemon: GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: No session for cookie
+  Error executing command as another user: Not authorized
+ 
+  This incident has been reported.
  Date: Mon Mar 29 02:33:25 2021
  Dependencies:
-  
+ 
  KvmCmdLine: COMMAND         STAT  EUID  RUID     PID    PPID %CPU COMMAND
  Lspci-vt:
-  -[0000:00]-+-00.0  Apple Inc. Device f020
-             +-01.0  Red Hat, Inc. Virtio network device
-             +-05.0  Red Hat, Inc. Virtio console
-             +-06.0  Red Hat, Inc. Virtio block device
-             \-07.0  Red Hat, Inc. Virtio RNG
+  -[0000:00]-+-00.0  Apple Inc. Device f020
+             +-01.0  Red Hat, Inc. Virtio network device
+             +-05.0  Red Hat, Inc. Virtio console
+             +-06.0  Red Hat, Inc. Virtio block device
+             \-07.0  Red Hat, Inc. Virtio RNG
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  Lsusb-t:
-  
+ 
  Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
  ProcEnviron:
-  TERM=screen
-  PATH=(custom, no user)
-  XDG_RUNTIME_DIR=<set>
-  LANG=C.UTF-8
-  SHELL=/bin/bash
+  TERM=screen
+  PATH=(custom, no user)
+  XDG_RUNTIME_DIR=<set>
+  LANG=C.UTF-8
+  SHELL=/bin/bash
  ProcKernelCmdLine: console=hvc0 root=/dev/vda
  SourcePackage: qemu
  UpgradeStatus: Upgraded to hirsute on 2020-12-30 (88 days ago)
  acpidump:
-  Error: command ['pkexec', '/usr/share/apport/dump_acpi_tables.py'] failed with exit code 127: polkit-agent-helper-1: error response to PolicyKit daemon: GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: No session for cookie
-  Error executing command as another user: Not authorized
-  
-  This incident has been reported.
+  Error: command ['pkexec', '/usr/share/apport/dump_acpi_tables.py'] failed with exit code 127: polkit-agent-helper-1: error response to PolicyKit daemon: GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: No session for cookie
+  Error executing command as another user: Not authorized
+ 
+  This incident has been reported.

** Also affects: qemu
   Importance: Undecided
       Status: New

** Changed in: qemu (Ubuntu)
   Importance: Undecided => Low

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1921664

Title:
  Coroutines are racy for risc64 emu on arm64 - crash on Assertion

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Confirmed

Bug description:
  Note: the "riscv64 on arm64" combination may matter only because it is
  slow-on-slow emulation; other architectures could be affected as well.

  The following case triggers on a Raspberry Pi 4 running arm64
  Ubuntu 21.04 [1][2]. It might trigger in other environments as well,
  but that is where we have seen it so far.

     $ wget https://github.com/carlosedp/riscv-bringup/releases/download/v1.0/UbuntuFocal-riscv64-QemuVM.tar.gz
     $ tar xzf UbuntuFocal-riscv64-QemuVM.tar.gz
     $ ./run_riscvVM.sh
  (wait ~2 minutes)
     [ OK ] Reached target Local File Systems (Pre).
     [ OK ] Reached target Local File Systems.
              Starting udev Kernel Device Manager...
  qemu-system-riscv64: ../../util/qemu-coroutine-lock.c:57: qemu_co_queue_wait_impl: Assertion `qemu_in_coroutine()' failed.

  This reproduces often, but not 100% of the time, and the failures differ
  slightly; we see either of:
  - qemu-system-riscv64: ../../util/qemu-coroutine-lock.c:57: qemu_co_queue_wait_impl: Assertion `qemu_in_coroutine()' failed.
  - qemu-system-riscv64: ../../block/aio_task.c:64: aio_task_pool_wait_one: Assertion `qemu_coroutine_self() == pool->main_co' failed.

  Rebuilding working cases has been shown to make them fail, and rebuilding
  (or even reinstalling) bad cases has made them work. Also, the same builds
  behave differently on different arm64 CPUs. TL;DR: The full list of
  conditions influencing the good/bad outcome is not yet known.

  [1]: https://ubuntu.com/tutorials/how-to-install-ubuntu-on-your-raspberry-pi#1-overview
  [2]: http://cdimage.ubuntu.com/daily-preinstalled/pending/hirsute-preinstalled-desktop-arm64+raspi.img.xz

  
  --- --- original report --- ---

  I regularly run a RISC-V (RV64GC) QEMU VM, but an update a few days
  ago broke it.  Now when I launch it, it hits an assertion:

  OpenSBI v0.6
     ____                    _____ ____ _____
    / __ \                  / ____|  _ \_   _|
   | |  | |_ __   ___ _ __ | (___ | |_) || |
   | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
   | |__| | |_) |  __/ | | |____) | |_) || |_
    \____/| .__/ \___|_| |_|_____/|____/_____|
          | |
          |_|

  ...
  Found /boot/extlinux/extlinux.conf
  Retrieving file: /boot/extlinux/extlinux.conf
  618 bytes read in 2 ms (301.8 KiB/s)
  RISC-V Qemu Boot Options
  1:      Linux kernel-5.5.0-dirty
  2:      Linux kernel-5.5.0-dirty (recovery mode)
  Enter choice: 1:        Linux kernel-5.5.0-dirty
  Retrieving file: /boot/initrd.img-5.5.0-dirty
  qemu-system-riscv64: ../../block/aio_task.c:64: aio_task_pool_wait_one: Assertion `qemu_coroutine_self() == pool->main_co' failed.
  ./run.sh: line 31:  1604 Aborted                 (core dumped) qemu-system-riscv64 -machine virt -nographic -smp 8 -m 8G -bios fw_payload.bin -device virtio-blk-device,drive=hd0 -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-device,rng=rng0 -drive file=riscv64-UbuntuFocal-qemu.qcow2,format=qcow2,id=hd0 -device virtio-net-device,netdev=usernet -netdev user,id=usernet,$ports

  Interestingly this doesn't happen on the AMD64 version of Ubuntu 21.04
  (fully updated).

  Think you have everything already, but just in case:

  $ lsb_release -rd
  Description:    Ubuntu Hirsute Hippo (development branch)
  Release:        21.04

  $ uname -a
  Linux minimacvm 5.11.0-11-generic #12-Ubuntu SMP Mon Mar 1 19:27:36 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux
  (note this is a VM running on macOS/M1)

  $ apt-cache policy qemu
  qemu:
    Installed: 1:5.2+dfsg-9ubuntu1
    Candidate: 1:5.2+dfsg-9ubuntu1
    Version table:
   *** 1:5.2+dfsg-9ubuntu1 500
          500 http://ports.ubuntu.com/ubuntu-ports hirsute/universe arm64 Packages
          100 /var/lib/dpkg/status

  ProblemType: Bug
  DistroRelease: Ubuntu 21.04
  Package: qemu 1:5.2+dfsg-9ubuntu1
  ProcVersionSignature: Ubuntu 5.11.0-11.12-generic 5.11.0
  Uname: Linux 5.11.0-11-generic aarch64
  ApportVersion: 2.20.11-0ubuntu61
  Architecture: arm64
  CasperMD5CheckResult: unknown
  CurrentDmesg:
   Error: command ['pkexec', 'dmesg'] failed with exit code 127: polkit-agent-helper-1: error response to PolicyKit daemon: GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: No session for cookie
   Error executing command as another user: Not authorized

   This incident has been reported.
  Date: Mon Mar 29 02:33:25 2021
  Dependencies:

  KvmCmdLine: COMMAND         STAT  EUID  RUID     PID    PPID %CPU COMMAND
  Lspci-vt:
   -[0000:00]-+-00.0  Apple Inc. Device f020
              +-01.0  Red Hat, Inc. Virtio network device
              +-05.0  Red Hat, Inc. Virtio console
              +-06.0  Red Hat, Inc. Virtio block device
              \-07.0  Red Hat, Inc. Virtio RNG
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  Lsusb-t:

  Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
  ProcEnviron:
   TERM=screen
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=C.UTF-8
   SHELL=/bin/bash
  ProcKernelCmdLine: console=hvc0 root=/dev/vda
  SourcePackage: qemu
  UpgradeStatus: Upgraded to hirsute on 2020-12-30 (88 days ago)
  acpidump:
   Error: command ['pkexec', '/usr/share/apport/dump_acpi_tables.py'] failed with exit code 127: polkit-agent-helper-1: error response to PolicyKit daemon: GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: No session for cookie
   Error executing command as another user: Not authorized

   This incident has been reported.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1921664/+subscriptions


