help-gawk

Re: inconstancy with RS = "(\r?\n){2}"


From: Ed Morton
Subject: Re: inconstancy with RS = "(\r?\n){2}"
Date: Sun, 25 Jul 2021 07:03:43 -0500



On 7/25/2021 7:01 AM, Alex fxmbsw7 Ratchev wrote:
I noticed that the terminal variant does process the input after a couple more \ns,
but that's still not good enough for TCP httpd connections.

You are correct; I stopped hitting Enter too soon:

------------
$ gawk -v RS='(\r?\n){2}' '{print "<"$0":"RT">"}'










<:

>
------------


On Sun, Jul 25, 2021, 13:55 Alex fxmbsw7 Ratchev <fxmbsw7@gmail.com> wrote:

Thank you for the accurate and detailed analysis.

On Sun, Jul 25, 2021, 13:49 Ed Morton <mortoneccc@comcast.net> wrote:


On 7/25/2021 4:47 AM, arnold@skeeve.com wrote:

Greetings.

Thank you for taking the time to make a bug report. In the future please
send a concise description of the problem with a test program and data.
It was hard for me to determine what you really think is the bug.

It looks like your concern is with the need to enter EOF more than
once from the terminal.

Gawk is designed mainly for batch processing (from files or a pipe).
Reading from a terminal with a complicated regexp as RS isn't the
normal use case.  When RS is a regexp gawk may have to do lookahead in
the input stream to be sure that the regexp has matched, and thus
the need for multiple EOFs.

In any case, I don't think there is an actual bug:

$ od -c data
0000000   a  \n  \n  \n   b  \n  \n  \n  \n   c  \n  \n  \n  \n   d  \n
0000020
$ ./gawk -v RS='(\r?\n){2}' -v ORS='|\n' '{ print }' < data
a|

b|
|
c|
|
d
|

This looks right to me.

Thanks,

Arnold



The problem occurs when reading from a terminal:

Good (no \r? in RS), every pair of `\n`s is recognized:
------------
$ gawk -v RS='(\n){2}' '{print "<"$0":"RT">"}'



<:


<:


<:

-----------------

Bad (with \r? in RS), no RS is ever recognized:
--------------
$ gawk -v RS='(\r?\n){2}' '{print "<"$0":"RT">"}'






-------------------

Meanwhile, when the input comes from a pipe, the RS including `\r?`
is recognized:
---------
$ printf '\n\n\n\n\n' | gawk -v RS='(\r?\n){2}' '{print "<"$0":"RT">"}'
<:

<:

<
:>
-----------

Regards,

     Ed.



