coreutils

Re: line buffering in pipes


From: William Bader
Subject: Re: line buffering in pipes
Date: Fri, 3 May 2019 04:13:14 +0000

GNU parallel has options that keep the output of concurrently running
commands from being scrambled. For example, -k keeps the output in the same
order as the input job list.
http://www.gnu.org/software/parallel/

https://www.gnu.org/software/parallel/man.html#EXAMPLE:-Keep-order-of-output-same-as-order-of-input

Regards, William



________________________________
From: coreutils <coreutils-bounces+williambader=address@hidden> on behalf of 
Egmont Koblinger <address@hidden>
Sent: Thursday, May 2, 2019 6:07 PM
To: Assaf Gordon
Cc: Denys Vlasenko; address@hidden
Subject: Re: line buffering in pipes

Hi Assaf,

Thanks a lot for this amazing overview!

> http://pubs.opengroup.org/onlinepubs/9699919799/functions/setvbuf.html :
>
>      setvbuf - assign buffering to a stream
>      [...]
>      Applications should note that many implementations only provide line
>      buffering on input from terminal devices.

I _think_ the correct way to parse this sentence is:
"many implementations only provide" – what: "line buffering on input"
– when/where: "from terminal devices"

rather than:
"many implementations only provide" – what: "line buffering" –
when/where: "on input from terminal devices"

in which case it's irrelevant for us since we're talking about the
output's buffering.

> 1.
> To the best of my understanding, glibc and musl-libc both
> implement line-buffering for all streams, not just terminals.

A bit of clarification: As for the output, I'm pretty sure they
implement support for it for all kinds of destinations, not just
terminals. According to its manual page, setbuf() conforms to C89, so
it's thirty-year-old stuff. However, this mode is the default only if
the output is connected to a terminal; otherwise block buffering is the
default, and this is what stdbuf overrides.
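
As a rough illustration of that default and of what stdbuf changes, here
is a minimal sketch. It assumes GNU coreutils' stdbuf is installed and
that the filter uses C stdio without setting its own buffering (cut
does); the function name is made up for the example.

```shell
# Sketch: force line-buffered stdout on a filter in the middle of a pipeline.
# Assumption: GNU coreutils' stdbuf is available on this system.
first_field_line_buffered() {
    # Without stdbuf, cut would block-buffer here because its stdout is a
    # pipe rather than a terminal; -oL asks for line buffering instead.
    stdbuf -oL cut -d ' ' -f1
}
```

In something like `tail -f some.log | first_field_line_buffered | uniq`
(file name hypothetical), each line should then reach uniq as soon as
cut has produced it, rather than once a full block has accumulated.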

> POSIX requires that a write(2) to a FIFO is atomic
> if the amount of data written is less than PIPE_BUF:

This is what I totally missed: I incorrectly believed that any
write() to a pipe (well, any write() of at least 2 bytes) could be
split. This led me to claim that your proposed solution was still not
robust.
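
For what it's worth, the limit can be queried from the shell. The sketch
below checks whether one line, including its newline, fits in a single
atomic pipe write (POSIX gives the guarantee for writes of PIPE_BUF
bytes or fewer). It assumes getconf is available and accepts PIPE_BUF
for the path /; the helper name is hypothetical. It says nothing about
the stdio side (BUFSIZ), which is not visible from the shell.

```shell
# Sketch: does a single line fit in one atomic pipe write?
# Assumption: `getconf PIPE_BUF /` works here (it prints 4096 on Linux).
fits_in_pipe_buf() {
    limit=$(getconf PIPE_BUF /) || return 2
    # Count bytes rather than characters, and include the trailing newline.
    # The arithmetic expansion strips any padding some wc versions print.
    len=$(( $(printf '%s\n' "$1" | wc -c) ))
    [ "$len" -le "$limit" ]
}
```

For example, `fits_in_pipe_buf "$line" || echo "too long" >&2` would flag
a line that could be split across writes.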

> Given all the above, is the following "robust enough" ?
>
>    find [DIRECTORY] | xargs -P99 stdbuf -oL [PROG1] | [PROG2]
>
> I think the answer is "it's complicated :)"
>
> [...]
>
> So my solution of "stdbuf" would *not* be portably robust for all
> systems. For GNU/Linux it should work fine.
>
> For other systems, lines longer than MIN(PIPE_BUF,BUFSIZ)
> will break atomicity, and output will be interleaved/mangled.

I fully agree with your conclusion.

(Still, I'd prefer to design the script so that it doesn't rely on
these constants being big enough on GNU/Linux, but instead would work
with any values.)


thanks a lot,
egmont


