bug-bash

wait -n misses signaled subprocess


From: Steven Pelley
Subject: wait -n misses signaled subprocess
Date: Mon, 22 Jan 2024 11:30:51 -0500

Hello,
I've encountered what I believe is a bug in bash's "wait -n".  wait -n
does not report processes that were terminated by a signal before
wait -n was called.  Instead, it returns 127 with an error that the
process id cannot be found.  Calling wait <pid> (without -n) afterwards
returns the expected exit status (e.g., 143).  I expect successive
calls to wait -n to return each process in turn, which is what happens
for processes that terminate in other ways, even before wait -n is
called.  Killing a process while wait -n is actively blocking works
correctly.  Test script at bottom.

I ran into this while coordinating my own cooperative exit and
handling/propagating SIGTERM.  If I propagate SIGTERM by killing
multiple processes at once (kill pid1 pid2 pid3 ...), the next call to
wait -n returns 143 and indicates a pid (via -p), but the call after
that returns 127 because all of the processes had already terminated.
If any of the awaited processes has not yet terminated, a previously
killed process is only reported when the next one terminates.  I have
workarounds and am not blocked, but this seems a reasonable use case
and worth sharing.
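
For illustration only, here is a minimal sketch of one possible
workaround (not necessarily the one referred to above).  It relies on
the observation in this report that plain "wait <pid>" still returns
the status of a child that was already killed.  The variable names
(pids, statuses) and the use of "sleep 30" as a stand-in child are
assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: reap each child by explicit pid instead of relying on
# successive "wait -n" calls. "sleep 30" stands in for real worker processes.

pids=""
for i in 1 2 3; do
    sleep 30 &
    pids="$pids $!"
done

# Propagate the signal to every child at once (word splitting is intentional).
kill -TERM $pids

# Plain "wait <pid>" reports the status even if the child has already died.
statuses=""
for pid in $pids; do
    wait "$pid"
    statuses="$statuses $?"
done

echo "statuses:$statuses"
```

The trade-off is that children are reaped in a fixed order rather than
in the order they exit, which is exactly what wait -n is meant to
avoid, so this only helps once all children are known to be exiting.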

I've tried:
- killing with SIGTERM and SIGALRM
- killing from the test script, a subshell, and another terminal (I
  don't believe this is related to kill being a builtin)
- enabling job control (set -m)
- bash versions 4.4.12, 5.2.15, and 5.2.21, all on Linux arm64

Test script:
# change to test other signals
sig=TERM

echo "TEST: KILL PRIOR TO wait -n @${SECONDS}"
{ sleep 1; exit 1; } &
pid=$!
echo "kill -$sig $pid @${SECONDS}"
kill -$sig $pid

sleep 2
wait -n $pid
echo "wait -n $pid return code $? @${SECONDS} (BUG)"
wait $pid
echo "wait $pid return code $? @${SECONDS}"

echo "TEST: KILL DURING wait -n @${SECONDS}"
{ sleep 2; exit 1; } &
pid=$!
{ sleep 1; echo "kill -$sig $pid @${SECONDS}"; kill -$sig $pid; } &

wait -n $pid
echo "wait -n $pid return code $? @${SECONDS}"
wait $pid
echo "wait $pid return code $? @${SECONDS}"


For which I get the following example output:
TEST: KILL PRIOR TO wait -n @0
kill -TERM 1384 @0
./test.sh: line 14:  1384 Terminated              { sleep 1; exit 1; }
wait -n 1384 return code 127 @2 (BUG)
wait 1384 return code 143 @2
TEST: KILL DURING wait -n @2
kill -TERM 1402 @3
./test.sh: line 25:  1402 Terminated              { sleep 2; exit 1; }
wait -n 1402 return code 143 @3
wait 1402 return code 143 @3

I expect the line ending (BUG) to indicate a return code of 143.
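
For readers unfamiliar with the value: 143 is the shell's encoding of
"terminated by signal 15 (SIGTERM)", i.e. 128 plus the signal number.
A small sketch of decoding such a status with the kill builtin follows;
the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Decode an exit status above 128 back into the signal that caused it.
status=143

signum=$((status - 128))        # 143 - 128 = 15
signame=$(kill -l "$status")    # kill -l accepts an exit status and prints the signal name

echo "status $status = signal $signum ($signame)"
```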

Thanks,
Steve Pelley


