bug#59493: cuirass-remote-worker crash
From: Ludovic Courtès
Subject: bug#59493: cuirass-remote-worker crash
Date: Tue, 22 Nov 2022 23:14:05 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
Hi,
In /var/log/cuirass-remote-worker.log on overdrive1.guix, I found this:
--8<---------------cut here---------------start------------->8---
2022-11-21 14:27:24 Backtrace:
2022-11-21 14:27:24 Backtrace:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 1752:10 10 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24 In unknown file:
2022-11-21 14:27:24 9 (apply-smob/0 #<thunk 3903a300>)
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 724:2 8 (call-with-prompt _ _ #<procedure
default-prompt-handle?>)
2022-11-21 14:27:24 In ice-9/eval.scm:
2022-11-21 14:27:24 1752:10 10 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24 619:8 7 (_ #(#(#<directory (guile-user) 3903dc80>)))
2022-11-21 14:27:24 In cuirass/ui.scm:
2022-11-21 14:27:24 In unknown file:
2022-11-21 14:27:24 9 (apply-smob/0 #<thunk 3903a300>)
2022-11-21 14:27:24 104:10 6 (run-cuirass-command _ . _)
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 724:2 8 (call-with-prompt _ _ #<procedure
default-prompt-handle?>)
2022-11-21 14:27:24 1752:10 5 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24 In ice-9/eval.scm:
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24 619:8 7 (_ #(#(#<directory (guile-user) 3903dc80>)))
2022-11-21 14:27:24 In cuirass/ui.scm:
2022-11-21 14:27:24 104:10 6 (run-cuirass-command _ . _)
2022-11-21 14:27:24 435:12 4 (_)
2022-11-21 14:27:24 In srfi/srfi-1.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 1752:10 5 (with-exception-handler _ _ #:unwind? _ # _)
2022-11-21 14:27:24 634:9 3 (for-each #<procedure 398a3510 at
cuirass/scripts/remo?> ?)
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24 448:18 2 (_ _)
2022-11-21 14:27:24 435:12 4 (_)
2022-11-21 14:27:24 In srfi/srfi-1.scm:
2022-11-21 14:27:24 634:9 3 (for-each #<procedure 398a3510 at
cuirass/scripts/remo?> ?)
2022-11-21 14:27:24 356:11 1 (start-worker _ _)
2022-11-21 14:27:24 In cuirass/scripts/remote-worker.scm:
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 448:18 2 (_ _)
2022-11-21 14:27:24 1685:16 0 (raise-exception _ #:continuable? _)
2022-11-21 14:27:24
2022-11-21 14:27:24 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2022-11-21 14:27:24 Throw to key `match-error' with args `("match" "no matching
pattern" (#vu8()))'.
2022-11-21 14:27:24 356:11 1 (start-worker _ _)
2022-11-21 14:27:24 In ice-9/boot-9.scm:
2022-11-21 14:27:24 1685:16 0 (raise-exception _ #:continuable? _)
2022-11-21 14:27:24
2022-11-21 14:27:24 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2022-11-21 14:27:24 Throw to key `match-error' with args `("match" "no matching
pattern" (#vu8()))'.
--8<---------------cut here---------------end--------------->8---
(Stuttering is due to the unprotected use of ‘primitive-fork’: a
non-local exit in the child leads it to execute the same code as its
parent. We should fix that, but should we really fork in the first
place? :-))
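For illustration, guarding the child could look roughly like the sketch below. It is only a sketch, not the Cuirass code: ‘fork-and-run’ is a made-up name, and the exit codes 1 and 127 are arbitrary choices. The idea is that the child always leaves through ‘primitive-exit’, so a non-local exit cannot carry it back into the code its parent is running.

```scheme
(use-modules (ice-9 match))

;; Hypothetical helper, not part of Cuirass: fork, run THUNK in the child,
;; and make sure the child never returns into the parent's code.
;; Returns the child's PID in the parent.
(define (fork-and-run thunk)
  (match (primitive-fork)
    (0
     ;; Child process.
     (dynamic-wind
       (const #t)
       (lambda ()
         (catch #t
           thunk
           (lambda (key . args)
             ;; Report the exception and exit instead of unwinding further.
             (format (current-error-port) "child: ~a ~s~%" key args)
             (primitive-exit 1)))
         (primitive-exit 0))
       (lambda ()
         ;; Reached only if a non-local exit escaped the handler above
         ;; (for instance an abort to an outer prompt).
         (primitive-exit 127))))
    (pid pid)))
```

With a guard like this, an exception in the child would be reported once by the child, which then exits, rather than being logged a second time by code meant for the parent.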
The ‘match-error’ itself comes from here:
--8<---------------cut here---------------start------------->8---
(define (read-server-info socket)
  (request-info socket)
  (match (zmq-get-msg-parts-bytevector socket '()) ;<-- here
    ((empty info)
     (match (zmq-read-message (bv->string info))
       (('server-info
         ('worker-address worker-address)
         ('log-port log-port)
         ('publish-port publish-port))
        (list worker-address log-port publish-port))))))
--8<---------------cut here---------------end--------------->8---
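In the backtrace, the value that fails to match is (#vu8()): a list with a single, empty bytevector, whereas the (empty info) pattern expects exactly two parts. As a diagnostic sketch only (it does not explain why the server replied that way), a catch-all clause would turn the crash into a logged, recoverable condition. ‘server-info-payload’ below is a hypothetical helper that works on a plain list of parts, so it can be tried without a socket.

```scheme
(use-modules (ice-9 match)
             (rnrs bytevectors))

;; Hypothetical helper, not part of Cuirass: given the list of message
;; parts read from the socket, return the INFO bytevector, or #f when the
;; reply does not have the expected (EMPTY INFO) shape.
(define (server-info-payload parts)
  (match parts
    ((empty info)
     info)
    (other
     ;; Log the unexpected reply instead of raising 'match-error'.
     (format (current-error-port)
             "unexpected reply from server: ~s~%" other)
     #f)))

;; The reply seen in the log above: one empty part only.
(server-info-payload '(#vu8()))        ;logs the reply, returns #f
```

The caller could then decide whether to retry the request or give up with a clearer message.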
This is the version being used:
--8<---------------cut here---------------start------------->8---
ludo@overdrive1 ~$ cat /proc/24019/cmdline |xargs -0
/gnu/store/zpir9n73amaxrwz2k7x46l73v21vxk6s-guile-3.0.8/bin/guile
--no-auto-compile -e main -s
/gnu/store/rlqdzmfyamjpn6lz07yqk2hsabv3l7g5-cuirass-1.1.0-11.9f08035/bin/.cuirass-real
remote-worker --workers=2 --server=10.0.0.1:5555
--systems=armhf-linux,aarch64-linux --publish-port=5558
--substitute-urls=http://10.0.0.1
ludo@overdrive1 ~$ guix system describe
Generation 36 Sep 27 2022 09:06:48 (current)
file name: /var/guix/profiles/system-36-link
canonical file name: /gnu/store/m04qw6f0lfd0wpn1skiys4b56wqfc3b8-system
label: GNU with Linux-Libre 5.19.11
bootloader: grub-efi
root device: /dev/sda3
kernel: /gnu/store/09r4wbbabskmbrnwmshpdk7vh6g87gam-linux-libre-5.19.11/Image
channels:
guix:
repository URL: https://git.savannah.gnu.org/git/guix.git
commit: f15a141cf35bd4188767f0e91c0654991d4c49e0
configuration file:
/gnu/store/myvzd1kpw2pfzfj3krl4lzpcbqsdn48x-configuration.scm
--8<---------------cut here---------------end--------------->8---
The sequence leading to this seems to be:
--8<---------------cut here---------------start------------->8---
22340 eventfd2(0, EFD_CLOEXEC <unfinished ...>
[…]
22340 <... eventfd2 resumed>) = 15
[…]
22340 ppoll([{fd=15, events=POLLIN}], 1, NULL, NULL, 0 <unfinished ...>
[…]
22340 <... ppoll resumed>) = 1 ([{fd=15, revents=POLLIN}])
22343 epoll_pwait(8, <unfinished ...>
22340 read(15, "\1\0\0\0\0\0\0\0", 8) = 8
22340 ppoll([{fd=15, events=POLLIN}], 1, {tv_sec=0, tv_nsec=0}, NULL, 0) = 0 (Timeout)
22340 write(2, "Backtrace:\n", 11) = 11
--8<---------------cut here---------------end--------------->8---
Does that ring a bell? Perhaps that was fixed in the meantime?
Right now the worker cannot be restarted: it always fails at startup with
the error above. 10.0.0.1 is reachable, though, so I’m not sure what’s up.
Ludo’.