From: Ludovic Courtès
Subject: bug#33239: 'guix offload' regularly hangs in 'channel-get-exit-status' call
Date: Tue, 25 Dec 2018 17:49:00 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)
Hello!
Ludovic Courtès <address@hidden> wrote:
> address@hidden (Ludovic Courtès) wrote:
>
>> The ‘guix offload’ processes on berlin regularly hang while calling
>> ‘channel-get-exit-status’:
>>
>> (gdb) bt
>> #0 0x00007f299fb330f1 in __GI___poll (fds=0x1dd58c0, nfds=1, timeout=-1) at
>> ../sysdeps/unix/sysv/linux/poll.c:29
>> #1 0x00007f2994287577 in ssh_poll_ctx_dopoll () from
>> target:/gnu/store/wmpg67bn7i7pqc0p4xjp1npnqixk9znd-libssh-0.7.6/lib/libssh.so.4
>> #2 0x00007f29942884d9 in ssh_handle_packets () from
>> target:/gnu/store/wmpg67bn7i7pqc0p4xjp1npnqixk9znd-libssh-0.7.6/lib/libssh.so.4
>> #3 0x00007f29942885ad in ssh_handle_packets_termination () from
>> target:/gnu/store/wmpg67bn7i7pqc0p4xjp1npnqixk9znd-libssh-0.7.6/lib/libssh.so.4
>> #4 0x00007f2994275080 in ssh_channel_get_exit_status () from
>> target:/gnu/store/wmpg67bn7i7pqc0p4xjp1npnqixk9znd-libssh-0.7.6/lib/libssh.so.4
>> #5 0x00007f29946dd11a in guile_ssh_channel_get_exit_status () from
>> target:/gnu/store/i3nfl17wfx7sryq6w15r9wxl7ilmq4rb-guile-ssh-0.11.3/lib/libguile-ssh.so.11
>> #6 0x00007f29a1765965 in vm_regular_engine (thread=0x1dd58c0, vp=0x1d4df30,
>> registers=0xffffffff, resume=-1615646479) at vm-engine.c:786
>
> I was able to come up with a reduced test case for Guile-SSH:
>
> https://github.com/artyom-poptsov/guile-ssh/issues/11

It turned out that the code to start a REPL server in (ssh dist node)
currently hangs, as I wrote in the bug report above.

After investigation, I decided that inferiors are more appropriate than
Guile-SSH’s node to address this use case, after all. Commit
ed7b44370f71126087eb953f36aad8dc4c44109f changes ‘guix offload’ to use
inferiors.

As a result, build machines must now run Guix > 0.15.0, which provides
‘guix repl’. That in turn simplifies setup of build machines: no need
to fiddle with GUILE_LOAD_PATH.
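
For illustration, here is a minimal sketch of what talking to an inferior
looks like from Scheme, using the local (guix inferior) interface. It is
not the code from the commit above: ‘guix offload’ reaches the build
machine over SSH, whereas this example opens an inferior on the local
machine. The directory is an assumption (the usual ‘guix pull’ profile),
and the expression being evaluated is an arbitrary placeholder.

```scheme
(use-modules (guix inferior))

;; Assumption: a pulled Guix lives in the usual per-user profile.
(define guix-directory
  (string-append (getenv "HOME") "/.config/guix/current"))

;; 'open-inferior' runs "bin/guix repl" (or equivalent) from the given
;; directory and returns #f if the inferior could not be launched.
(define inferior
  (open-inferior guix-directory))

(when inferior
  ;; Evaluate a placeholder expression inside the inferior process and
  ;; print whatever value it returns.
  (display (inferior-eval '(+ 1 2) inferior))
  (newline)
  (close-inferior inferior))
```

Since the inferior is started through ‘guix repl’, the other side only
needs a Guix that provides that command, which is why no
GUILE_LOAD_PATH setup is involved.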

On berlin, build machines were running an older Guix, so I copied a
recently pulled Guix to each of them and installed it in
~/.config/guix/current. They’re now operational, except for the ARMv7
one, which is still pulling. So far it seems to be working well, but
we’ll have to keep an eye on it.

Thanks,
Ludo’.