bug-gawk
Re: [bug-gawk] Computed regex and getline bug / issue


From: Andrew J. Schorr
Subject: Re: [bug-gawk] Computed regex and getline bug / issue
Date: Sun, 4 May 2014 17:55:11 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

On Sun, May 04, 2014 at 06:09:01PM +0200, Davide Brini wrote:
> Curiously, removing the square brackets works:
> 
> $ printf '1,2,' | gawk 'BEGIN{RS=","}{print; a = getline; print "-"a"-"; print}'
> 1
> -1-
> 2
> 
> $ printf '1,2,' | gawk 'BEGIN{RS=",+"}{print; a = getline; print "-"a"-"; print}'
> 1
> -1-
> 2

I haven't quite figured it out yet, but I have some clues.

Strangely, it seems to depend on the length of the RS regexp relative
to the number of characters remaining in the input, and on whether the
regexp contains one of '+ * ? |'.

It seems to relate to the logic in io.c:rsrescan.  When that function
returns TERMNEAREND instead of REC_OK, that appears to trigger the bug.

This is OK:

bash-4.2$ printf '1,2,' | gawk 'BEGIN{RS=",+"}{printf "[%s] [%s]\n", $0, RT; a = getline; print "-"a"-"; printf "[%s] [%s]\n",$0, RT}'
[1] [,]
-1-
[2] [,]

This is broken:

bash-4.2$ printf '1,2,' | gawk 'BEGIN{RS="[,]+"}{printf "[%s] [%s]\n", $0, RT; a = getline; print "-"a"-"; printf "[%s] [%s]\n",$0, RT}'
[1] [,]
-0-
[1] [,]

This is also broken:

bash-4.2$ printf '1,2,' | gawk 'BEGIN{RS=",a?"}{printf "[%s] [%s]\n", $0, RT; a = getline; print "-"a"-"; printf "[%s] [%s]\n",$0, RT}'
[1] [,]
-0-
[1] [,]
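
To make the suspected condition concrete, here is a toy model in Python.
It is not gawk's code: the function names are hypothetical, and the test
"fewer characters remain after the match than the RS string is long" is
an assumption inferred from the cases above, not read from io.c.

```python
import re

# Stand-ins for the two return codes named in this message.
REC_OK = "REC_OK"
TERMNEAREND = "TERMNEAREND"


def maybe_long(rs):
    """Crude heuristic: does the regexp contain one of + * ? | ?"""
    return any(ch in rs for ch in "+*?|")


def classify_first_match(data, rs):
    """Model the suspected check for the first RS match in `data`.

    Assumption: the terminator counts as "near the end" when fewer
    characters remain after the match than the RS string is long.
    """
    m = re.search(rs, data)
    if m is None:
        return None  # no terminator found at all
    remaining = len(data) - m.end()
    if maybe_long(rs) and remaining < len(rs):
        return TERMNEAREND
    return REC_OK


for rs in [",", ",+", "[,]+", ",a?"]:
    print(rs, classify_first_match("1,2,", rs))
```

For the input '1,2,' this model reports REC_OK for "," and ",+" and
TERMNEAREND for "[,]+" and ",a?", which lines up with the working and
broken cases above.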

I don't understand the interaction between io.c:get_a_record and
io.c:rsrescan well enough to see what's going wrong.  If I comment out the
line in rsrescan that returns TERMNEAREND, it seems to fix this problem.
But I assume it must break something else.

The potential patch is attached.

To my surprise, "make check" passes with this patch.  But there must be some
reason for returning TERMNEAREND.  Does anybody have any insight into 
the logic here?  Why is TERMNEAREND useful?

Regards,
Andy

Attachment: rsrescan.patch
Description: Text document
