bug-bash


Re: wait -n misses signaled subprocess


From: Robert Elz
Subject: Re: wait -n misses signaled subprocess
Date: Tue, 30 Jan 2024 03:49:51 +0700

    Date:        Mon, 29 Jan 2024 12:07:53 -0500
    From:        Chet Ramey <chet.ramey@case.edu>
    Message-ID:  <fe9aafeb-673d-494a-9c82-2b81b67c1373@case.edu>

  | What does `wait -n' without job arguments mean?

Find, or if there are none already, wait*(2) for, a process (job technically)
that has changed state (which means terminated in POSIX, and one day in the
NetBSD shell; that difference isn't relevant here) and return its status.
If there's already a terminated job (a job which has changed status, in bash)
then no wait type sys call gets performed (that already happened).

It also returns the status of that process, rather than the simple "0" which
a bare "wait" returns (and with the appropriate arg, tells you which process
it was).
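To make that concrete, here is a minimal sketch in bash (not the NetBSD
shell). It assumes bash 5.1 or later, where "wait -n -p var" stores the pid
of the job that was reaped; the variable names and sleep durations are made
up for illustration.

```shell
#!/usr/bin/env bash
# Two async jobs with different lifetimes and exit statuses.
(sleep 0.2; exit 3) &
fast=$!
(sleep 1; exit 5) &
slow=$!

# wait -n returns as soon as any one job has finished and yields that
# job's exit status; -p (bash 5.1+) records which pid it was.
wait -n -p done_pid
first_status=$?
echo "first finished: pid=$done_pid status=$first_status"

# A bare wait blocks until everything is done and returns 0,
# regardless of how the remaining jobs exited.
wait
all_status=$?
echo "bare wait returned: $all_status"
```

The status of the slow job is lost here because the bare wait discards it;
a script that cares would wait on "$slow" explicitly.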

  | OK. Since wait without options can already wait for the same pid multiple
  | times, the -n option has to bring some new functionality here.

Yes, without options, it waits until all listed arg processes (jobs) are
finished (or changed state) and returns the status of the last.   With -n
it waits for any one of them, just as the bash man page says it will.
The "any one" (vs "all") is the new functionality.
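The "all" versus "any one" difference can be sketched like this, assuming a
bash recent enough to accept pid operands together with -n (5.0 or later,
as far as I know). The exit statuses and delays are arbitrary.

```shell
#!/usr/bin/env bash
# "All": wait for every listed pid, return the status of the last operand.
(sleep 0.1; exit 1) &
a=$!
(sleep 0.3; exit 2) &
b=$!
wait "$a" "$b"
all_status=$?          # status of $b, the last operand

# "Any one": return as soon as one of the listed pids has finished.
(sleep 0.5; exit 4) &
c=$!
(sleep 0.1; exit 6) &
d=$!
wait -n "$c" "$d"
any_status=$?          # status of whichever finished first, here $d

echo "all: $all_status  any: $any_status"
wait                   # collect the job that is still running
```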

  | As long as it's still in the jobs list.

Yes, of course - the final para of my message covered that case.

  | OK. We can agree there shouldn't be any difference between `wait pid'
  | and `wait -n pid'.

Yes, but just because that's a degenerate case of the more general commands,
which happens in each case to devolve into the same thing.

And from a different message:

chet.ramey@case.edu said:
  | So should the shell require the user to periodically run `wait' in a non-
  | interactive shell without job control to clean dead jobs out of the jobs
  | list? I don't think so. 

I do.   Either wait or jobs will do it ("jobs >/dev/null" is a nice simple
clean up, without the potential hang waiting for things to terminate that the
wait utility imposes).   A new option to wait(1) would be another way (either
a simple one, perhaps -t, to only wait for already terminated jobs, or a
timeout, where 0 indicates never to wait at all (ie: don't do the wait sys
call) which would be a more general, but more costly, mechanism).   But as
long as it is just a matter of cleaning up, and jobs works for that, I don't
currently see the need.
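A rough sketch of that clean up, written for bash only so that it can be
run as is. Note that a non-interactive bash may already discard finished
jobs on its own, so the effect is easier to observe in a shell that retains
them; the delay is arbitrary.

```shell
#!/usr/bin/env bash
# Start a couple of short-lived async jobs and never wait for them.
(exit 0) &
(exit 1) &
sleep 0.3            # give both jobs time to terminate

# jobs reports the finished jobs and the shell then forgets them;
# unlike wait, it never blocks on a job that is still running.
jobs >/dev/null

remaining=$(jobs)    # whatever is still listed after the clean up
echo "jobs still listed: ${remaining:-none}"
```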

Of course, you're also allowed to dump processes from the lists if there
get to be too many of them, but on modern systems, it really should be
possible to retain hundreds, if not thousands, without any real problem.

And of course, you're not required to retain status of any job if there's
no way that the script can request it - but determining that these days is
difficult.  It used to be easy in the Sys V/POSIX model where if $! wasn't
saved, then there was no way for the script to request the status, as it
couldn't (reasonably - parsing job trees from ps output doesn't count) find
out the pid to wait for (and simple "wait" never returns any status).

These days, with the jobs command available, a script could do
        pids=$(jobs -l | code to parse the output and print the pids)
and determine what it can wait for that way (the code isn't difficult)
- and it can also wait on %1 %2 ... without having any idea what the pids
might be, so in practice adding the (non-trivial) code to monitor references
to $! isn't worth the bother (IMO).
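As a sketch of how a script can find things to wait for without ever saving
$!, the following uses "jobs -p", which prints the pids directly, instead of
parsing "jobs -l" output (whose layout differs between shells). It is
written for bash; the exit statuses are arbitrary.

```shell
#!/usr/bin/env bash
# Start two async jobs without saving $! for either of them.
(sleep 0.2; exit 7) &
(sleep 0.3; exit 9) &

# Recover the pids from the jobs list afterwards.
pids=$(jobs -p)

# Wait for each recovered pid in turn and collect its exit status.
statuses=
for pid in $pids; do
    wait "$pid"
    statuses="$statuses $?"
done
echo "collected statuses:$statuses"
```

Waiting on %1, %2 and so on gets to the same jobs without the script
needing to know any pid at all.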

It's also a bit unusual for non-interactive code to run lots of async jobs
without waiting for results - doing that is a sure way to run into the
"max user processes" limit, and have things start failing.   If there are
fewer than that, then having the shell retain the info until the script
terminates isn't really a very big cost, should the script not bother to
ever clean up.

  | I think it's whether or not `wait -n pid' behaves the same as `wait pid' and
  | looks in the list of saved exit statuses if the pid isn't found in a job in
  | the jobs list. 

We have it simpler than that: there's just one list, which serves both
purposes.  That makes things easier I believe (in all three of: shell code,
shell doc, and user understanding), even if it does consume a few more bytes
for a little longer than is really needed (jobs needs the command strings, so
they can be printed, wait doesn't, so retaining that is an extra cost ... not
one large enough for anyone to have ever noticed though).

kre



