help-gawk

Re: inconstancy with RS = "(\r?\n){2}"


From: Alex fxmbsw7 Ratchev
Subject: Re: inconstancy with RS = "(\r?\n){2}"
Date: Sun, 25 Jul 2021 14:01:27 +0200

I noticed that in the terminal variant it does process the input after a
couple more \n's, but that's still not good enough for TCP httpd connections.

On Sun, Jul 25, 2021, 13:55 Alex fxmbsw7 Ratchev <fxmbsw7@gmail.com> wrote:

> thank you for the accurate and detailed analysis
>
> On Sun, Jul 25, 2021, 13:49 Ed Morton <mortoneccc@comcast.net> wrote:
>
>>
>>
>> On 7/25/2021 4:47 AM, arnold@skeeve.com wrote:
>>
>> Greetings.
>>
>> Thank you for taking the time to make a bug report. In the future please
>> send a concise description of the problem with a test program and data.
>> It was hard for me to determine what you really think is the bug.
>>
>> It looks like your concern is with the need to enter EOF more than
>> once from the terminal.
>>
>> Gawk is designed mainly for batch processing (from files or a pipe).
>> Reading from a terminal with a complicated regexp as RS isn't the
>> normal use case.  When RS is a regexp gawk may have to do lookahead in
>> the input stream to be sure that the regexp has matched, and thus
>> the need for multiple EOFs.
>>
>> In any case, I don't think there is an actual bug:
>>
>> $ od -c data
>> 0000000   a  \n  \n  \n   b  \n  \n  \n  \n   c  \n  \n  \n  \n   d  \n
>> 0000020
>> $ ./gawk -v RS='(\r?\n){2}' -v ORS='|\n' '{ print }' < data
>> a|
>>
>> b|
>> |
>> c|
>> |
>> d
>> |
>>
>> This looks right to me.
>>
>> Thanks,
>>
>> Arnold
>>
>>
>>
>> The problem occurs when reading from a terminal:
>>
>> Good (no \r? in RS), every pair of `\n`s is recognized:
>> ------------
>> $ gawk -v RS='(\n){2}' '{print "<"$0":"RT">"}'
>>
>>
>>
>> <:
>>
>> >
>>
>>
>> <:
>>
>> >
>>
>>
>> <:
>>
>> >
>> -----------------
>>
>> Bad (with \r? in RS), no RS is ever recognized:
>> --------------
>> $ gawk -v RS='(\r?\n){2}' '{print "<"$0":"RT">"}'
>>
>>
>>
>>
>>
>>
>> -------------------
>>
>> Meanwhile if the input was coming from a pipe the RS including `\r?`
>> would be recognized:
>> ---------
>> $ printf '\n\n\n\n\n' | gawk -v RS='(\r?\n){2}' '{print "<"$0":"RT">"}'
>> <:
>>
>> >
>> <:
>>
>> >
>> <
>> :>
>> -----------
>>
>> Regards,
>>
>>     Ed.
>>
>
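
For the interactive (terminal or socket) case, one possible workaround is to
avoid a regexp RS altogether and read line by line, treating the first empty
line as the end of the block. This is only a minimal sketch, not something
proposed in the thread: the helper name read_headers and the sample request
are made up for illustration, and it assumes the goal is to stop at the blank
line that ends an HTTP-style header block. With the default RS (a single
newline) awk does not need to look ahead past the line it has just read.

```shell
# Hypothetical helper: print header lines until the first blank line.
# Uses the default RS (newline), so no regexp lookahead is involved.
read_headers() {
    awk '
        { sub(/\r$/, "") }      # tolerate CRLF as well as LF line endings
        $0 == "" { exit }       # an empty line ends the header block
        { print "<" $0 ">" }    # show each header line in brackets
    '
}

# Sample input (assumed, for illustration only): two header lines,
# a blank line, then a body that is never read as a header.
printf 'GET / HTTP/1.1\r\nHost: example\r\n\r\nbody\n' | read_headers
```

Nothing here is gawk-specific, so gawk should behave the same way. The
trade-off is that RT is no longer available to tell which terminator was
seen; the sub() call simply discards a trailing \r.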

