bug-gawk
From: Nethox
Subject: Re: [bug-gawk] Fwd: why does referencing NF or different fields change $0 after recompilation?
Date: Fri, 18 Sep 2015 14:06:50 +0200

The bug is reproducible with GNU Awk 4.0.1.

Manual page gawk(1), section Fields:
"References to non-existent fields (i.e. fields after $NF) produce the null-string."
"Assigning a value to an existing field causes the whole record to be rebuilt when $0 is referenced."
I think the bug lies in an optimization that avoids unnecessary field splitting before $0 is printed. It seems to confuse "non-existent fields" with "existent but not yet referenced fields".

Referencing NF or a higher, non-existent field (like $7) works correctly because it requires complete field splitting, which bypasses the optimization and its bug.
It does not matter which field number you reassign to trigger the $0 recomputation. The right-hand side of an assignment is also an expression, so the workaround reference can be placed there:
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{ NF = NF } 1' file
"A";"B";"C"
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{ $NF = $NF } 1' file
"A";"B";"C"
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{ $7; $1 = $1 } 1' file
"A";"B";"C"
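
The NF = NF form above could be wrapped in a small shell function for reuse. This is only a sketch, assuming GNU awk is installed as `gawk`; the function name `csv_to_semicolons` is made up for illustration and is not part of gawk.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative only).
# Assigning NF to itself references NF, which requires complete field
# splitting, and also marks the record to be rebuilt with OFS.
csv_to_semicolons() {
    gawk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{ NF = NF } 1' "$@"
}

# Usage: csv_to_semicolons file
#    or: some_command | csv_to_semicolons
```

With no arguments the function reads standard input, so it can sit in a pipeline.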

These tests show that $0 is recomputed between expressions, not only between full statements or action blocks. They also show that the bug increases NF by inserting a null-string field at the position where the optimization stopped the splitting:
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{ $1=$1; print NF "  " $0 }' file
3  "A";"B";"C"
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{ $1=$1; print $0 "  " NF "  |" $2 "|" }' file
"A";;"B";"C"  4  ||
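
One way to watch for the extra field is to compare NF after the rebuild with NF of the untouched record. The following is a rough sketch under the assumption that `gawk` is on the PATH; `nf_check` and the awk variable names are hypothetical, and the result depends on the gawk version in use.

```shell
#!/bin/sh
# Hypothetical check: does rebuilding $0 change the field count?
nf_check() {
    gawk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '
    {
        orig = $0        # save the record before any field is touched
        $1 = $1          # mark the record for rebuilding
        rebuilt = $0     # referencing $0 performs the rebuild
        after = NF       # field count after the rebuild
        $0 = orig        # restore the saved record; it is split again
        before = NF      # field count of the original record
        status = (before == after) ? "same" : "changed"
        printf "%s before=%d after=%d\n", status, before, after
    }' "$@"
}

# Usage: nf_check file
```

A line starting with "changed" would indicate that the rebuild altered the number of fields for that record.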

Regards.


On Thu, Sep 17, 2015 at 3:04 PM, Ed Morton <address@hidden> wrote:
I got a couple of responses to my question below on comp.lang.awk and I'm now pretty confident it's a bug and not my misunderstanding of something.

Josef Frank pointed out that:

Reminds me of a bug (in gawk 4.0.0) mentioned at the top of:
http://git.savannah.gnu.org/cgit/gawk.git/tree/test/pty1.awk

Can you take a look?

        Ed.

-------- Forwarded Message --------
Subject: why does referencing NF or different fields change $0 after recompilation?
Date: Wed, 16 Sep 2015 22:15:05 -0500
From: Ed Morton <address@hidden>
Organization: A noiseless patient Spider
Newsgroups: comp.lang.awk

This GNU awk script is intended to replace commas with semi-colons in a CSV file
that could contain commas in the quoted fields and could have blank fields:

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1=$1}1' file

Can anyone explain why the first call to awk below replaces the first comma with
two semi-colons while the second and third (which are only different from the
first in that they mention NF somewhere in the action block) replace it with
one, which is the desired result?

$ cat file
"A","B","C"

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1=$1}1' file
"A";;"B";"C"

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{NF;$1=$1}1' file
"A";"B";"C"

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1=$1;NF}1' file
"A";"B";"C"

Mentioning some other variables has no effect:

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{OFS;$1=$1}1' file
"A";;"B";"C"

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$0;$1=$1}1' file
"A";;"B";"C"

while just mentioning $1/2/3 can change where the double-semi-colon appears:

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1;$1=$1}1' file
"A";;"B";"C"
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$2;$1=$1}1' file
"A";"B";;"C"
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$3;$1=$1}1' file
"A";"B";"C"

and (presumably related) we get a different $0 depending on which field we
assign to itself:

$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$1=$1}1' file
"A";;"B";"C"
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$2=$2}1' file
"A";"B";;"C"
$ awk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' '{$3=$3}1' file
"A";"B";"C"
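
The three self-assignment cases above can be run in one loop and compared against the desired result. This is a sketch only, assuming `gawk` is on the PATH; the function name `check_self_assign` is hypothetical, and what it reports depends on the gawk version.

```shell
#!/bin/sh
# Hypothetical regression check for the $N=$N cases shown above.
check_self_assign() {
    expected='"A";"B";"C"'
    for i in 1 2 3; do
        # Build the literal program text, e.g. {$1=$1}1
        out=$(printf '%s\n' '"A","B","C"' |
              gawk -v FPAT='([^,]*)|("[^"]+")' -v OFS=';' "{\$$i=\$$i}1")
        if [ "$out" = "$expected" ]; then
            printf '%s\n' "\$$i=\$$i: ok"
        else
            printf '%s\n' "\$$i=\$$i: got $out"
        fi
    done
}

# Usage: check_self_assign
```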

I'm using

$ awk --version
GNU Awk 4.1.3, API: 1.1 (GNU MPFR 3.1.3, GNU MP 6.0.0)

in bash on cygwin.

Regards,

        Ed.




