Re: [bug-gawk] Problem with printing 5000 lines to a coprocess
From: Hermann Peifer
Subject: Re: [bug-gawk] Problem with printing 5000 lines to a coprocess
Date: Mon, 22 Dec 2014 12:24:29 -0200
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:24.0) Gecko/20100101 Thunderbird/24.6.0
Thanks again for the explanations.
On 2014-12-21 12:50, Andrew J. Schorr wrote:
> Hmmm. The "pty" trick is just a way to solve the flushing problem.
Which is exactly the one I wanted to solve, as the flushing problem made
my initial code (send 1 line, then read 1 result) either hang or run
terribly slowly (after forcing the coprocess to flush its output via close()).
Another "trick" would be to use a 2-way pipe and request line-buffering
like this, as I learned from [0]:
command = "stdbuf -o L " subprogram
I am adding a brief description of my "odyssey" below. Maybe it is of
use for someone else running into a similar issue.
Hermann
[0]
Unix buffering delays output to stdout, ruins your day
http://www.turnkeylinux.org/blog/unix-buffering
The idea: send 1 line of data via a 2-way pipe to a separate programme
for processing, then read the resulting line. In my case, the programme
is the geod utility from the PROJ.4 library, which expects lat1 lon1 lat2
lon2 as input and returns the azimuths and the distance between the 2
points. My initial code was:
[1]
command = "geod -I +ellps=WGS84"
for (...) {
    print one_data_line |& command
    command |& getline one_result_line
    ...
}
close(command)
The above code hangs after printing the 1st line, as the coprocess does
not flush its output. After some trial and error, I changed the code to:
[2]
for (...) {
    print one_data_line |& command
    close(command, "to")
    command |& getline one_result_line
    ...
    close(command, "from")
}
The above works, but is terribly slow, as every iteration closes the
coprocess and starts a new one for the next line, so I changed the code
to "send all data first, then read all results":
[3]
for (...) {
    print one_data_line |& command
}
close(command, "to")
while ((command |& getline one_result_line) > 0) {
    ...
}
close(command, "from")
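The above code seemed to work fine when sending, say, 1000 lines. It did,
however, hang after sending some 4000+ lines, due to the "output buffer
is full" problem: nobody reads the coprocess's results while gawk is
still writing, so once the pipe fills up, both sides block. So I changed
to the tempfile option: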
[4]
tempfile = ("mydata." PROCINFO["pid"])
command = "geod -I +ellps=WGS84 > " tempfile
# Write the data for processing
while (not done with data)
    print data | command
close(command)
# Read the results, remove tempfile when done
while ((getline one_result_line < tempfile) > 0)
    ...
close(tempfile)
system("rm " tempfile)
The above worked fine and fast as far as I can tell, but the manual
tells me that this is not elegant and I should use a two-way
communication with a coprocess instead. So I went back to where I came
from and fixed the "output buffer is not flushed" problem like this:
[5a]
command = "geod -I +ellps=WGS84"
PROCINFO[command, "pty"] = 1
for (...) {
    print one_data_line |& command
    command |& getline one_result_line
    ...
}
close(command)
[5b]
command = "stdbuf -o L geod -I +ellps=WGS84"
for (...) {
    print one_data_line |& command
    command |& getline one_result_line
    ...
}
close(command)
As mentioned in the manual, option 5a is somewhat slower than 5b: around
20% in my code.
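
For anyone who wants to try the 5b pattern without having geod installed,
here is a minimal, self-contained sketch. It is only an illustration under
stated assumptions: "tr a-z A-Z" is a stand-in filter of my choosing (not
part of the original problem), stdbuf is assumed to be the GNU coreutils
one, and the script name in the usage line is hypothetical. It sends one
line, reads one result, and checks the return value of getline so that a
dead coprocess is reported rather than silently ignored.

```awk
# Usage (hypothetical file name): gawk -f coproc_demo.awk input.txt
# Requires gawk (for |&) and GNU stdbuf.

BEGIN {
    # Stand-in for "geod -I +ellps=WGS84" (assumption, for illustration only).
    # "stdbuf -o L" asks the filter to line-buffer its stdout, so each result
    # line is written back to gawk as soon as it is complete.
    command = "stdbuf -o L tr a-z A-Z"
}

{
    # Send one line to the coprocess ...
    print $0 |& command

    # ... and read exactly one result line back.
    if ((command |& getline one_result_line) > 0) {
        print one_result_line
    } else {
        print "no result from coprocess for input line " NR > "/dev/stderr"
        exit 1
    }
}

END {
    # Close both ends of the two-way pipe and let the coprocess finish.
    close(command)
}
```

The same loop should work with PROCINFO[command, "pty"] = 1 in place of
stdbuf (option 5a), as long as that assignment is made before the first
print to the coprocess.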