
Re: [bug-gawk] Computed regex and getline bug / issue


From: Davide Brini
Subject: Re: [bug-gawk] Computed regex and getline bug / issue
Date: Sun, 4 May 2014 18:09:01 +0200

On Sun, 4 May 2014 17:31:55 +0800, Grail Dane <address@hidden> wrote:

> Hello
>
> As part of an exercise in displaying data from a file I have come across
> an issue which neither myself nor any of the good people at
> linuxquestions.org have been able to solve and believe it may be a bug
> within gawk. Using the following data as an input file:
>
> 1 , 2
> 3 , 4
> 5 , 6
> 7 , 8
> 9 , 10
>
> In case this does not display correctly, the format is:
> number space comma space number
>
> Using the following basic gawk we are able to return data as follows:
>
> $ awk '{print "|"$0"|"}' RS='[,\n]' file
> |1 |
> | 2|
> |3 |
> | 4|
> |5 |
> | 6|
> |7 |
> | 8|
> |9 |
> | 10|
>
> Pipes included to simply show white space. If we then use getline prior
> to our print we receive:
>
> $ awk '{getline;print "|"$0"|"}' RS='[,\n]' file
> | 2|
> | 4|
> | 6|
> | 8|
> | 10|
>
> Which again is all fine, however, if we then extend the RS computed regex
> to allow for spaces, our original output is the same but minus the spaces:
>
> $ awk '{print "|"$0"|"}' RS='[,\n ]+' file
> |1|
> |2|
> |3|
> |4|
> |5|
> |6|
> |7|
> |8|
> |9|
> |10|
>
> Again, as expected.  Once we go back to our getline version where we
> expect to return every second record, we now see our 'bug':
>
> $ awk '{getline;print "|"$0"|"}' RS='[,\n ]+' f2
> |2|
> |4|
> |6|
> |8|
> |9|   <-- This should have been |10|
>
> The thread for further discussion on this issue can be found here:
> http://www.linuxquestions.org/questions/programming-9/peculiar-awk-behaviour-confusing-me-4175503599/
>
> Please advise if you should require any further information.
>
> Cheers
> Grail


I have been able to reduce the behavior to these simple test cases, which
(unless I'm missing something obvious) should behave identically but don't:

$ printf '1,2,' | gawk 'BEGIN{RS="[,]"}{print; a = getline; print "-"a"-"; print}'
1
-1-
2

$ printf '1,2,' | gawk 'BEGIN{RS="[,]+"}{print; a = getline; print "-"a"-"; print}'
1
-0-
1

That is, in the second case getline detects EOF and does not update $0.
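Independently of the bug, a script can avoid silently reusing a stale $0 by
checking what getline returns (1 for a record, 0 at end of input, -1 on
error). The following is only a sketch of that defensive pattern, not a fix
for the behavior above: the function name pair_records and the "(unpaired)"
label are made up for illustration, and it uses a single-character RS, which
the tests in this message suggest is not affected.

```shell
# Hypothetical helper: print records two per line, and flag a trailing
# record that has no partner instead of printing a stale $0.
pair_records() {
    awk 'BEGIN { RS = "," }
    {
        first = $0
        # getline returns 1 on success, 0 at end of input, -1 on error
        if ((getline) > 0)
            print first, $0
        else
            print first, "(unpaired)"
    }'
}

printf '1,2,3,' | pair_records
```

With three records the last one has no partner, so the else branch is taken
for it rather than printing whatever $0 held before.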

Curiously, removing the square brackets works:

$ printf '1,2,' | gawk 'BEGIN{RS=","}{print; a = getline; print "-"a"-"; print}'
1
-1-
2

$ printf '1,2,' | gawk 'BEGIN{RS=",+"}{print; a = getline; print "-"a"-"; print}'
1
-1-
2
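
Since the four cases differ only in the value of RS, they can be run from a
single loop to make comparison across versions easier. This is a rough
sketch under some assumptions: check_rs is a hypothetical name, and the AWK
environment variable (also my own convention) selects the interpreter,
defaulting to plain awk; set AWK=gawk to test gawk specifically. On an
affected gawk I would expect the '[,]+' line to report getline=0, matching
the second test above, but I have not listed expected output because it
depends on the awk being run.

```shell
# Hypothetical harness: same input and program, four RS values.
# For the first record, report what getline returned and what $0 holds.
check_rs() {
    for rs in '[,]' '[,]+' ',' ',+'; do
        printf '1,2,' | "${AWK:-awk}" -v rs="$rs" '
            BEGIN { RS = rs }
            # After a successful getline NR becomes 2, so this rule
            # fires only once per run.
            NR == 1 {
                a = getline
                printf "RS=%s getline=%d $0=%s\n", rs, a, $0
            }
        '
    done
}

check_rs
```

Usage: AWK=gawk check_rs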


-- 
D.


