bug-guix

bug#59493: cuirass-remote-worker crash


From: Ludovic Courtès
Subject: bug#59493: cuirass-remote-worker crash
Date: Wed, 23 Nov 2022 16:47:32 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Hi,

Mathieu Othacehe <othacehe@gnu.org> skribis:

>> 2022-11-21 14:27:24   1685:16  0 (raise-exception _ #:continuable? _)
>> 2022-11-21 14:27:24
>> 2022-11-21 14:27:24 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
>> 2022-11-21 14:27:24 Throw to key `match-error' with args `("match" "no matching pattern" (#vu8()))'.
>
> Yes this is because a new remote-server is running on Berlin and it
> sends an empty sequence at every connection:
> https://git.savannah.gnu.org/cgit/guix/guix-cuirass.git/commit/?id=fc1641381d2a8a0472a71ef5ad2b64361faaaab4

Oh I see.  It would be nice to avoid non-backward-compatible changes in
the protocol so we can upgrade more smoothly.
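
For illustration only, here is a minimal Python sketch of a receiver that
tolerates messages it does not understand instead of crashing on them.
This is not Cuirass code (Cuirass is written in Guile and uses a different
wire format); the `dispatch` function, the space-separated message layout,
and the `handlers` table are all hypothetical.

```python
def dispatch(message: bytes, handlers: dict):
    """Route one message to its handler; skip anything not understood.

    Hypothetical layout: b"<kind> <payload>".  Returns the handler's
    result, or None when the message is ignored.
    """
    if not message:
        # An empty message, such as one a newer peer might send on
        # connect, is ignored rather than treated as a fatal error.
        return None
    kind, _, payload = message.partition(b" ")
    handler = handlers.get(kind)
    if handler is None:
        # Unknown message kind: an older receiver skips it and carries on.
        return None
    return handler(payload)
```

The catch-all branches are what let an older worker keep running when a
newer server starts sending something extra.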

> All remote-workers must update, and I have deployed Cuirass
> 1.1.0-13.1341725 on all hydra workers + guix9p.
>
> I have been trying to deploy that to overdrive1 for two days but Berlin
> offloads the builds to kreuzberg which has some issues because a lot of
> builds are timeouting:

Done now!

--8<---------------cut here---------------start------------->8---
ludo@overdrive1 ~$ guix system describe
Generation 37   Nov 23 2022 15:58:08    (current)
  file name: /var/guix/profiles/system-37-link
  canonical file name: /gnu/store/62dr875n7i30l375j87flbqfym78kddg-system
  label: GNU with Linux-Libre 6.0.9
  bootloader: grub-efi
  root device: /dev/sda3
  kernel: /gnu/store/p4impcxw8lba8600acrxs21lgzc06xzq-linux-libre-6.0.9/Image
  channels:
    guix:
      repository URL: https://git.savannah.gnu.org/git/guix.git
      commit: 78f03567f44f704dfbc03cb64368aa42a01e78ad
  configuration file: /gnu/store/myvzd1kpw2pfzfj3krl4lzpcbqsdn48x-configuration.scm
--8<---------------cut here---------------end--------------->8---

Running the Shepherd 0.9.3 and all, wonderful.

>> (Stuttering is due to the unprotected use of ‘primitive-fork’: a
>> non-local exit in the child leads it to execute the same code as its
>> parent.  We should fix that, but should we really fork in the first
>> place?  :-))

Fixed in Cuirass commit 9fb6f21d29c5398b35f4c1a77cf6c20f207c9ebb.
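
The general pattern for protecting a fork can be sketched as follows, in
Python rather than Guile.  This is an illustration of the idea, not the
contents of the commit above; `fork_guarded` and `child_main` are
hypothetical names.  The point is that the child must never return into
the code its parent is running, even when it raises.

```python
import os
import traceback


def fork_guarded(child_main):
    """Fork and run child_main() in the child; return the child PID.

    The child always ends in os._exit, so an exception cannot unwind
    back into the caller's code and run it a second time.
    """
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            child_main()
            status = 0
        except BaseException:
            # Report the failure; the child then exits with status 1.
            traceback.print_exc()
        finally:
            # os._exit skips exit handlers and does not flush stdio
            # buffers, so output buffered by the parent before the fork
            # is not written twice.
            os._exit(status)
    return pid
```

Without the `finally` clause, an exception in the child would propagate
up the same stack as the parent's, which is the "stuttering" described
above.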

> Right, this is problematic. I can't remember why I chose to fork.

One concern is that, in the Avahi case, we create at least one thread
before forking, and as we know that doesn’t work (as in: it might work
sometimes).  ZMQ may also create threads behind our back.

The parent doesn’t call ‘waitpid’ on its children, which isn’t great.
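
As a rough sketch of what reaping could look like (again in Python, with
the hypothetical helper name `reap_children`), the parent can
periodically collect every child that has already exited without
blocking:

```python
import os


def reap_children():
    """Collect all children that have already exited, without blocking.

    Returns a list of (pid, exit_code) pairs; exit_code is None when the
    child was terminated by a signal.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
        reaped.append((pid, code))
    return reaped
```

Calling this from the main loop (or from a SIGCHLD handler) keeps exited
children from lingering as zombies.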

To me, ideally this would be either multi-threaded or Fiberized.  The
latter would be more fruitful, but integrating guile-simple-zmq with
Fibers might be difficult (though maybe not: zmq_getsockopt + ZMQ_FD
lets us get the file descriptor of a socket).

Something to consider…

Thanks,
Ludo’.




