bug-guix
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#66692: [cuirass] Spurious failed builds following patching u-boot so


From: Ludovic Courtès
Subject: bug#66692: [cuirass] Spurious failed builds following patching u-boot sources
Date: Mon, 23 Oct 2023 22:49:56 +0200
User-agent: Gnus/5.13 (Gnus v5.13)

Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

> The build failures are not captured in the logs, which appear truncated,
> e.g. from <https://ci.guix.gnu.org/build/2294052/details> (log empty --
> builder: sjd-p9, worker N38vG7OW) 

On sjd-p9, I get:

--8<---------------cut here---------------start------------->8---
root@sjd-p9:~# journalctl -u cuirass-remote-worker.service |grep -C2 
s167lh0w0azfxx83ri8k0mn2gj57i5rl
Oct 23 00:06:21 sjd-p9 cuirass[531]: N38vG7OW: derivation 
/gnu/store/kcv6hfqz94f1vdw24viw7d9ih6v9zi1j-libkmahjongg-23.04.3.drv build 
succeeded.
Oct 23 00:06:21 sjd-p9 cuirass[531]: N38vG7OW: request work.
Oct 23 00:06:21 sjd-p9 cuirass[531]: N38vG7OW: building derivation 
`/gnu/store/s167lh0w0azfxx83ri8k0mn2gj57i5rl-u-boot-rpi-4-32b-2023.07.02.drv' 
(system: powerpc64le-linux)
Oct 23 00:06:32 sjd-p9 cuirass[531]: C2q41fts: ping tcp://10.0.0.1:5555.
Oct 23 00:06:32 sjd-p9 cuirass[531]: BrnR2Sju: ping tcp://10.0.0.1:5555.
--
Oct 23 00:34:16 sjd-p9 cuirass[531]: vA6cYbb5: request work.
Oct 23 00:34:16 sjd-p9 cuirass[531]: vA6cYbb5: building derivation 
`/gnu/store/lj4h1g1zq235l6i8mjh88s1ylvj7ksxd-u-boot-pine64-plus-2023.07.02.drv' 
(system: powerpc64le-linux)
Oct 23 00:34:29 sjd-p9 cuirass[531]: N38vG7OW: derivation 
/gnu/store/s167lh0w0azfxx83ri8k0mn2gj57i5rl-u-boot-rpi-4-32b-2023.07.02.drv 
build failed.
Oct 23 00:34:29 sjd-p9 cuirass[531]: N38vG7OW: request work.
Oct 23 00:34:29 sjd-p9 cuirass[531]: N38vG7OW: building derivation 
`/gnu/store/dy8g5pi1gv8dza1av9957ic6zzj4l88j-u-boot-tools-2023.07.02.drv' 
(system: powerpc64le-linux)
root@sjd-p9:~# guix build --log-file 
/gnu/store/s167lh0w0azfxx83ri8k0mn2gj57i5rl-u-boot-rpi-4-32b-2023.07.02.drv
https://ci.guix.gnu.org/log/0yzfwl2b017gsnf60wvr7dnn2g26wrj7-u-boot-rpi-4-32b-2023.07.02
--8<---------------cut here---------------end--------------->8---

So it looks as though the build was attempted and failed, but there’s no
local build log in /var/log/guix/drvs, meaning that, most likely, no
build process was even started.

The log of guix-daemon provide few hints:

--8<---------------cut here---------------start------------->8---
root@sjd-p9:~# journalctl -u guix-daemon |grep 'Oct 23 00:'

[...]

Oct 23 00:06:21 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user 
root
Oct 23 00:06:21 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user 
root
Oct 23 00:06:55 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user 
root
Oct 23 00:07:51 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user 
root
Oct 23 00:16:25 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user 
root
Oct 23 00:17:15 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user 
root
Oct 23 00:33:42 sjd-p9 guix-daemon[2097426]: SIGPOLL
Oct 23 00:33:42 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user 
root
Oct 23 00:33:49 sjd-p9 guix-daemon[2097426]: unexpected build daemon error: 
interrupted by the user
Oct 23 00:34:16 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user 
root
Oct 23 00:34:16 sjd-p9 guix-daemon[1990649]: SIGPOLL
Oct 23 00:34:23 sjd-p9 guix-daemon[1990649]: unexpected build daemon error: 
interrupted by the user
Oct 23 00:34:29 sjd-p9 guix-daemon[2052532]: SIGPOLL
Oct 23 00:34:29 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user 
root
Oct 23 00:34:30 sjd-p9 guix-daemon[2052532]: unexpected build daemon error: 
interrupted by the user
Oct 23 00:34:38 sjd-p9 guix-daemon[1996946]: SIGPOLL
Oct 23 00:34:38 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user 
root
Oct 23 00:34:38 sjd-p9 guix-daemon[1996946]: unexpected build daemon error: 
interrupted by the user
Oct 23 00:34:45 sjd-p9 guix-daemon[2248556]: SIGPOLL
Oct 23 00:34:45 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user 
root
Oct 23 00:34:45 sjd-p9 guix-daemon[2248556]: unexpected build daemon error: 
interrupted by the user
Oct 23 00:35:33 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user 
root
Oct 23 00:36:46 sjd-p9 guix-daemon[534]: accepted connection from pid 531, user 
root
--8<---------------cut here---------------end--------------->8---

One possibility would have been that ‘guix gc’ was running, preventing
the build process from starting (“big GC lock” taken); however, if this
were the case, we’d see a connection from a different process in the
guix-daemon log.

> or from <https://ci.guix.gnu.org/build/2294218/details> (builder:
> hydra-guix-107, worker 4CWieLpQ) which ends with:

Here we get:

--8<---------------cut here---------------start------------->8---
root@hydra-guix-107 ~# grep -C2 jy463789qf42dn4zk5i5jzry2lf82ypy 
/var/log/cuirass-remote-worker.log
2023-10-23 00:17:34 vDKeRcHy: no available build.
2023-10-23 00:17:46 4CWieLpQ: request work.
2023-10-23 00:17:46 4CWieLpQ: building derivation 
`/gnu/store/jy463789qf42dn4zk5i5jzry2lf82ypy-u-boot-cubietruck-2023.07.02.drv' 
(system: x86_64-linux)
2023-10-23 00:18:03 zARGXQMa: request work.
2023-10-23 00:18:03 zARGXQMa: no available build.
--
2023-10-23 00:27:35 vDKeRcHy: request work.
2023-10-23 00:27:35 vDKeRcHy: no available build.
2023-10-23 00:27:42 4CWieLpQ: derivation 
/gnu/store/jy463789qf42dn4zk5i5jzry2lf82ypy-u-boot-cubietruck-2023.07.02.drv 
build failed.
2023-10-23 00:27:42 4CWieLpQ: request work.
2023-10-23 00:27:42 4CWieLpQ: no available build.
root@hydra-guix-107 ~# ls /var/log/guix/drvs/*/*u-boot-cubietruck-2023*
ls: cannot access '/var/log/guix/drvs/*/*u-boot-cubietruck-2023*': No such file 
or directory
--8<---------------cut here---------------end--------------->8---

And likewise:

--8<---------------cut here---------------start------------->8---
root@hydra-guix-107 ~# grep -C2 '10-23 00:27:' /var/log/guix-daemon.log
2023-10-23 00:17:46 accepted connection from pid 64022, user root
2023-10-23 00:19:34 accepted connection from pid 64022, user root
2023-10-23 00:27:42 SIGPOLL
2023-10-23 00:27:45 unexpected build daemon error: interrupted by the user
2023-10-23 00:33:33 accepted connection from pid 18016, user hydra
2023-10-23 00:33:34 accepted connection from pid 18035, user hydra
--8<---------------cut here---------------end--------------->8---

But hey, as a matter of fact, ‘cuirass remote-server’ on berlin was
restarted at around that time, as can be seen in
/var/log/cuirass-remote-server.log:

--8<---------------cut here---------------start------------->8---
2023-10-23 00:26:06 0 items queued for eventual download
2023-10-23 00:26:55 period update: 0 resumable, 0 failed builds.
2023-10-23 00:26:58 error: failed to start worker/database proxy: Invalid 
argument
2023-10-23 00:26:59 increased maximum number of open files from 1024 to 2048
2023-10-23 00:26:59 listening for build logs on port 5556
2023-10-23 00:26:59 deleting old build logs from '/var/cache/cuirass/remote/'...
2023-10-23 00:28:49 selected 1115 build logs to remove
2023-10-23 00:28:49 WARNING: (cuirass scripts remote-server): imported module 
(fibers) overrides core binding `sleep'
2023-10-23 00:28:49 starting fetch worker with up to 8 concurrent downloads
2023-10-23 00:28:49 build failed: 
'/gnu/store/qlmhcyk9ac30464v99i2ys0irs7jz3lg-u-boot-am335x-evm-2023.07.02.drv'
2023-10-23 00:28:49 build failed: 
'/gnu/store/dl3far2xa4paq6nl6pp9iklpxrwg1wy2-ungoogled-chromium-wayland-112.0.5615.165-1.drv'
2023-10-23 00:28:50 build failed: 
'/gnu/store/y9fwzzgvyhvrmf0avihm6944j8rvpg6m-u-boot-mx6cuboxi-2023.07.02.drv'
2023-10-23 00:28:50 build failed: 
'/gnu/store/5bhl5q86j6bkqx8zybq29khndhf050jm-u-boot-rpi-2-2023.07.02.drv'
2023-10-23 00:28:50 build failed: 
'/gnu/store/16d5vq3dlzy1f4s7vazlfg6358bl8dlc-u-boot-qemu-arm-2023.07.02.drv'
2023-10-23 00:28:51 build failed: 
'/gnu/store/i88vczhcnlz462cmpls8qgx5n3ssw7rc-u-boot-qemu-arm-2023.07.02.drv'
2023-10-23 00:28:51 build failed: 
'/gnu/store/jy463789qf42dn4zk5i5jzry2lf82ypy-u-boot-cubietruck-2023.07.02.drv'
2023-10-23 00:28:51 build failed: 
'/gnu/store/i1ri2nkh1bcfzynmggbjg0rjlyi121y8-u-boot-rpi-3-32b-2023.07.02.drv'
2023-10-23 00:28:51 build failed: 
'/gnu/store/q08xw052r484c1c1nj31sx0fg5yf2py9-openjdk-18.0.2.drv'
2023-10-23 00:28:52 build failed: 
'/gnu/store/5j4pp2b5dx2ykw4na2wbmiwp84g0hi4i-u-boot-am335x-boneblack-2023.07.02.drv'
2023-10-23 00:28:52 build failed: 
'/gnu/store/nqp61id4bavvcsib307rw5234myl7ykr-u-boot-rpi-3-32b-efi-2023.07.02.drv'
2023-10-23 00:28:52 build failed: 
'/gnu/store/9h7x080cn44ik79mrzycbg2x0dvvgigc-u-boot-a20-olinuxino-lime-2023.07.02.drv'
2023-10-23 00:28:52 build failed: 
'/gnu/store/mkwy8ywa6mrw5zdxwr88skyw94afgaqv-u-boot-novena-2023.07.02.drv'
2023-10-23 00:28:53 build failed: 
'/gnu/store/lfjp146k0z0w2yqd3y6m2mv09fq01948-u-boot-sifive-unmatched-2023.07.02.drv'
2023-10-23 00:28:53 build failed: 
'/gnu/store/425ykg0ci4i612ryc7z79as94vipp9rb-u-boot-rpi-2-efi-2023.07.02.drv'
2023-10-23 00:28:57 build failed: 
'/gnu/store/7fyq24jsmvclnapjgf1li43w6dmm5n06-u-boot-mx6cuboxi-2023.07.02.drv'
[…]
2023-10-23 00:34:29 build failed: 
'/gnu/store/s167lh0w0azfxx83ri8k0mn2gj57i5rl-u-boot-rpi-4-32b-2023.07.02.drv'
--8<---------------cut here---------------end--------------->8---

/var/log/messages shows that ‘remote-server’ failed and was respawned
(the failure manifests as “failed to start worker/database proxy” in the
cuirass-remote-server.log excerpt above):

--8<---------------cut here---------------start------------->8---
Oct 23 00:24:33 localhost vmunix: [12534110.756389] BTRFS error (device sde2): 
parent transid verify failed on logical 30457856 mirror 2 wanted 61709 found 11
Oct 23 00:24:33 localhost vmunix: [12534110.768173] BTRFS error (device sde2): 
parent transid verify failed on logical 30457856 mirror 1 wanted 61709 found 11
Oct 23 00:24:34 localhost vmunix: [12534110.779454] BTRFS error (device sde2): 
parent transid verify failed on logical 30457856 mirror 2 wanted 61709 found 11
Oct 23 00:24:59 localhost ntpd[1953]: Soliciting pool server 81.169.208.5
Oct 23 00:26:04 localhost ntpd[1953]: Soliciting pool server 141.98.136.83
Oct 23 00:26:58 localhost shepherd[1]: Service cuirass-remote-server (PID 
92414) exited with 1. 
Oct 23 00:26:58 localhost shepherd[1]: Respawning cuirass-remote-server. 
Oct 23 00:26:58 localhost shepherd[1]: Starting service user-homes... 
Oct 23 00:26:58 localhost shepherd[1]: Service user-homes has been started. 
Oct 23 00:26:58 localhost shepherd[1]: Service user-homes started. 
Oct 23 00:26:58 localhost shepherd[1]: Service user-homes running with value 
#t. 
Oct 23 00:26:58 localhost shepherd[1]: Starting service postgres-roles... 
Oct 23 00:26:58 localhost shepherd[1]: Service postgres-roles has been started. 
Oct 23 00:26:58 localhost shepherd[1]: Service postgres-roles started. 
Oct 23 00:26:58 localhost shepherd[1]: Service postgres-roles running with 
value #t. 
Oct 23 00:26:58 localhost shepherd[1]: Starting service 
cuirass-remote-server... 
Oct 23 00:26:58 localhost shepherd[1]: Service cuirass-remote-server has been 
started. 
Oct 23 00:26:58 localhost shepherd[1]: Service cuirass-remote-server started. 
Oct 23 00:26:58 localhost shepherd[1]: Service cuirass-remote-server running 
with value 63052. 
Oct 23 00:26:59 localhost vmunix: [12534255.969754] validate_extent_buffer: 58 
callbacks suppressed
Oct 23 00:26:59 localhost vmunix: [12534255.969764] BTRFS error (device sde2): 
parent transid verify failed on logical 30457856 mirror 1 wanted 61709 found 11
--8<---------------cut here---------------end--------------->8---

To summarize:

  1. ‘cuirass remote-server’ failed for some unknown reason and was
     respawned.

  2. All the ongoing and about-to-be-started builds carried out by
     worker processes at that point were marked as failed.

We can improve on both parts.

To be continued…

Ludo’.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]