
Re: Examples of concurrent coproc usage?


From: Carl Edquist
Subject: Re: Examples of concurrent coproc usage?
Date: Thu, 14 Mar 2024 04:58:48 -0500 (CDT)

[My apologies up front for the length of this email. The short story is I played around with the multi-coproc support: the fd closing seems to work fine to prevent deadlock, but I found one bug apparently introduced with multi-coproc support, and one other coproc bug that is not new.]

On Mon, 11 Mar 2024, Zachary Santer wrote:

> Was "RFE: enable buffering on null-terminated data"
>
> On Mon, Mar 11, 2024 at 7:54 AM Carl Edquist <edquist@cs.wisc.edu> wrote:
>>
>> (Kind of a side-note ... bash's limited coprocess handling was a long-standing annoyance for me in the past, to the point that I wrote a bash coprocess management library to handle multiple active coprocesses and give convenient methods for interaction. Perhaps the trickiest bit about multiple coprocesses open at once (which I suspect is the reason support was never added to bash) is that you don't want the second and subsequent coprocesses to inherit the pipe fds of prior open coprocesses. This can result in deadlock if, for instance, you close your write end to coproc1, but coproc1 continues to wait for input because coproc2 also has a copy of a write end of the pipe to coproc1's input. So you need to be smart about subsequent coprocesses first closing all fds associated with other coprocesses.)
>
> https://lists.gnu.org/archive/html/help-bash/2021-03/msg00296.html
> https://lists.gnu.org/archive/html/help-bash/2021-04/msg00136.html

Oh hey! Look at that. Thanks for the links to this thread - I gave them a read (along with the old thread from 2011-04). I feel a little bad I missed the 2021 discussion.


> You're on the money, though there is a preprocessor directive you can build bash with that will allow it to handle multiple concurrent coprocesses without complaining: MULTIPLE_COPROCS=1.

Who knew! Thanks for mentioning it. When I saw that "only one active coprocess at a time" was _still_ listed in the bugs section in bash 5, I figured multiple coprocess support had just been abandoned. Chet, that's cool that you implemented it.

I kind of went all-out on my bash coprocess management library though (mostly back in 2014-2016) ... It's pretty feature-rich and pleasant to use -- to the point that I don't think there is any going back to bash's internal coproc for me, even with multiple coproc support. I implemented it with shell functions, so it doesn't rely on compiling anything or on having the latest version of bash present. (I even added bash3 support for older systems.)

> Chet Ramey's sticking point was that he hadn't seen coprocesses used enough in the wild to satisfactorily test that his implementation did in fact keep the coproc file descriptors out of subshells.

To be fair, coproc is kind of a niche feature. But I think more people would play with it if it were less awkward to use and if they felt free to experiment with multiple coprocs.

By the way, I agree with Chet's exact description of the problems here:

    https://lists.gnu.org/archive/html/help-bash/2021-03/msg00282.html

The issue is separate from the stdio buffering discussion; the issue here is with child processes (and I think not foreground subshells, but specifically background processes, including coprocesses) inheriting the shell's fds that are open to pipes connected to an active coprocess.

Not getting a SIGPIPE/write failure results in a coprocess sitting around longer than it ought to, but it's not obvious (to me) how this leads to deadlock: the shell has closed its read end of the pipe to that coprocess, so at least you aren't going to hang trying to read from it.

On the other hand, a coprocess not seeing EOF will cause deadlock pretty readily, especially if it processes all its input before producing output (as with wc, sort, sha1sum). Trying to read from the coprocess will hang indefinitely if the coprocess is still waiting for input, which is the case if there is another copy of the write end of its read pipe open somewhere.


> If you've got examples you can direct him to, I'd really appreciate it.

[My original use cases for multiple coprocesses were (1) for programmatically interacting with multiple command-line database clients together, and (2) for talking to multiple interactive command-line game engines (Othello), to have them play each other.

Perl's IPC::Open2 works, too, but it's easier to experiment on the fly in bash.

And in general having the freedom to play with multiple coprocesses helps mock up more complicated pipelines, or even webs of interconnected processes.]

> But you can create a deadlock without doing anything fancy.


Well, *without multi-coproc support*, here's a simple wc example; first with a single coproc:

        $ coproc WC { wc; }
        $ exec {WC[1]}>&-     # close the write end; wc sees EOF
        $ read -u ${WC[0]} X  # collect wc's output
        $ echo $X
        0 0 0

This works as expected.

But if you try it with a second coproc (again, without multi-coproc support), the second coproc will inherit copies of the shell's read and write pipe fds to the first coproc, and the read will hang (as described above), as the first coproc doesn't see EOF:

        $ coproc WC { wc; }
        $ coproc CAT { cat; }   # CAT's shell inherits the shell's fds for WC
        $ exec {WC[1]}>&-       # wc still has a writer (inside CAT), so no EOF
        $ read -u ${WC[0]} X

        # HANGS


But this can be observed even before attempting the read that hangs.

You can 'ps' to see the user shell (bash), the coprocs' shells (bash), and the coprocs' commands (wc & cat). Then 'ls -l /proc/PID/fd/' to see what they have open:

- The user shell has its copies of the read & write fds open for both coprocs (as it should)

- The coproc commands (wc & cat) each have only a single read & write pipe open, on fd 0 & 1 (as they should)

- The first coproc's shell (WC) has only a single read & write pipe open, on fd 0 & 1 (as it should)

- The second coproc's shell (CAT) has its own read & write pipes open, on fd 0 & 1 (good), but it also has a copy of the user shell's read & write pipe fds to the first coproc (WC) open (on fd 60 & 63 in this case, which it inherited when forking from the user shell)

(And in general, latter coproc shells will have stray copies of the user shell's r/w ends from all previous coprocs.)

So, you can examine the situation after setting up coprocs to see whether all the coproc-related processes have just two pipes open (on fd 0 & 1). If so, that suffices (for me, anyway) to show that no deadlocks related to stray open fds can happen. But if any of them has other pipes open (inherited from the user shell), that indicates the problem.
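For example, with the WC and CAT coprocs above still standing, a quick check might look like this (a sketch: $WC_PID and $CAT_PID are set by the coproc keyword, fd numbers will vary, and /proc is Linux-specific):

        $ for pid in $WC_PID $CAT_PID; do
        >     echo "== coproc shell $pid =="
        >     ls -lgo /proc/$pid/fd/
        > done

Each coproc shell (and likewise its wc or cat child, whose pid you can get from ps) should show only fds 0 & 1 open on pipes; any additional pipe fds are exactly the strays inherited from the user shell.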


I tried compiling the latest bash (version 5.2.21(1)) with MULTIPLE_COPROCS=1 to test out the multi-coproc support.
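For reference, one way to get the define in, assuming the configure-generated Makefile honors CPPFLAGS (the exact knob may differ on your system):

        $ ./configure
        $ make CPPFLAGS='-DMULTIPLE_COPROCS=1'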

I tried standing up the above WC and CAT coprocs, together with some others, to check that the behavior looked OK for pipelines as well (which I think was one of Chet's concerns):

        $ coproc WC { wc; }
        $ coproc CAT { cat; }
        $ coproc CAT3 { cat | cat | cat; }
        $ coproc CAT4 { cat | cat | cat | cat; }
        $ coproc CATX { cat ; }

And as far as the fd situation, everything checks out: the user shell has fds open to all the coprocs, and the coproc shells & coproc commands (including all the cat's in the pipelines) have only a single read & write pipe open on fd 0 & 1. So, the multi-coproc code seems to be closing the shell's copies correctly.

[The examples are boring, but their point is just to investigate the stray-fd question.]


HOWEVER!!!

Unexpectedly, the new multi-coproc code seems to close the user shell's end of a coprocess's pipes, once the coprocess has terminated. When compiled with MULTIPLE_COPROCS=1, this is true even if there is only a single coproc:

        $ coproc WC { wc; }
        $ exec {WC[1]}>&-
        [1]+  Done                    coproc WC { wc; }

        # WC var gets cleared!!
        # shell's ${WC[0]} is also closed!

        # now, can't do:

        $ read -u ${WC[0]} X
        $ echo $X

I'm attaching a "bad-coproc-log.txt" with more detailed ps & ls output examining the open fds at each step, to make it clear what's happening.

This is a bug. The shell should not automatically close its read pipe to a coprocess that has terminated -- it should stay open to read the final output, and the user should be responsible for closing the read end explicitly.

This is more obvious for commands that wait until they see EOF before generating any output (wc, sort, sha1sum). But it's also true for commands that produce output as they go, whether filters (sed) or generators (ls). If the shell's read end is closed automatically, any final output waiting in the pipe is discarded.
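To see the consequence without any reading at all (a sketch, under the same MULTIPLE_COPROCS build):

        $ coproc LS { ls /; }
        # ls writes its listing and exits immediately; once the shell
        # reaps it, the shell closes its read end ${LS[0]}, and the
        # listing still sitting unread in the pipe is silently lost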

It also invites trouble if the shell variable that holds the fds gets removed unexpectedly when the coprocess terminates. (Suddenly the variable expands to an empty string.) It seems to me that the proper time to clear the coproc variable (if at all) is after the user has explicitly closed both of the fds. *Or* else add an option to the coproc keyword to explicitly close the coproc - which will close both fds and clear the variable.
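In the meantime, one user-side workaround (a sketch; it relies only on the fact that closing a descriptor does not affect an earlier duplicate of it) is to duplicate the coproc's fds into your own variables right away, so the auto-close doesn't take your handles with it:

        $ coproc WC { wc; }
        $ exec {wc_r}<&"${WC[0]}" {wc_w}>&"${WC[1]}"  # private duplicates
        $ exec {WC[1]}>&- {wc_w}>&-    # close *both* write ends -> EOF
        $ read -u "$wc_r" X            # fine even after the coproc is reaped
        $ echo $X
        0 0 0
        $ exec {wc_r}<&-               # closed explicitly, when *we* are done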

...

Separately, I consider the following coproc behavior to be weird, fragile, and broken.

If you fg a coproc, then stop and bg it, it dies. Why? Apparently the shell abandons the coproc when it is stopped, closes the pipe fds for it, and clears the fd variable.

        $ coproc CAT { cat; }
        [1] 10391

        $ fg
        coproc CAT { cat; }

        # oops!

        ^Z
        [1]+  Stopped                 coproc CAT { cat; }

        $ echo ${CAT[@]}  # what happened to the fds?

        $ ls -lgo /proc/$$/fd/
        total 0
        lrwx------ 1 64 Mar 14 02:26 0 -> /dev/pts/3
        lrwx------ 1 64 Mar 14 02:26 1 -> /dev/pts/3
        lrwx------ 1 64 Mar 14 02:25 2 -> /dev/pts/3
        lrwx------ 1 64 Mar 14 02:26 255 -> /dev/pts/3

        $ bg
        [1]+ coproc CAT { cat; } &

        $
        [1]+  Done                    coproc CAT { cat; }

        $ # sad user :(


This behavior is not new to the multi-coproc support. But just the same it seems broken for the shell to automatically close the fds to coprocesses. That should be done explicitly by the user.
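To spell out what "explicitly by the user" would look like (this is how things ought to behave; with the auto-close behavior above, the last steps may already have been done for you -- which is exactly the complaint):

        $ coproc CAT { cat; }
        $ echo hello >&${CAT[1]}
        $ read -u ${CAT[0]} X; echo $X
        hello
        $ exec {CAT[1]}>&-    # user decides when the coproc sees EOF
        $ exec {CAT[0]}<&-    # and when to stop reading from it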


>> Word to the wise: you might encounter this issue (coproc2 prevents coproc1 from seeing its end-of-input) even though you are rigging this up yourself with FIFOs rather than bash's coproc builtin.

> In my case, it's mostly a non-issue, because I fork the - now three - background processes before exec'ing automatic fds redirecting to/from their FIFOs in the parent process. All the automatic fds get put in an array, and I do close them all at the beginning of a subsequent process substitution.

That's a nice trick, with the shell backgrounding all the coprocesses before connecting the FIFOs. But yeah, to make subsequent coprocesses you do still have to close the copies of the user shell's fds that the coprocess shells inherit. It sounds like you are doing that (nice!), but in any case it requires some care, and as these stack up it is really handy to have something manage it all for you.
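For anyone following along, that FIFO pattern looks something like this (a minimal sketch with hypothetical paths; real use wants a private temp dir, cleanup, and -- for coprocesses forked later -- closing the inherited copies of $to_wc and $from_wc):

        $ mkfifo /tmp/wc.in /tmp/wc.out
        $ wc </tmp/wc.in >/tmp/wc.out &    # fork the coprocess first
        $ exec {to_wc}>/tmp/wc.in {from_wc}</tmp/wc.out
        $ echo hello >&$to_wc
        $ exec {to_wc}>&-                  # EOF for wc
        $ read -u $from_wc X; echo $X
        1 1 6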

(Perhaps this is where I ask if you are happy with your solution or if you would like to try out something wildly more flexible...)


Happy coprocessing! :)

Carl

Attachment: bad-coproc-log.txt
Description: Text document

