Re: [bug-gawk] FIELDWIDTHS can miscount the number of fields


From: Wolfgang Laun
Subject: Re: [bug-gawk] FIELDWIDTHS can miscount the number of fields
Date: Sun, 21 May 2017 18:45:45 +0200

It will be really interesting to see what NF will be when FIELDWIDTHS is set to
   3 4 5
and the input record is
   aaabbbbcc

 Perhaps NF == 2.4? ;-)
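
To make the example concrete, here is a minimal Python sketch of fixed-width splitting. It is only a model of the idea, not gawk's implementation; the helper name split_fixed and the "fill" measure are made up for illustration. It shows that the third field receives only 2 of its 5 characters, which is where the joking "2.4" comes from.

```python
def split_fixed(record, widths):
    """Cut record into consecutive slices of the given widths.

    Hypothetical helper: a slice that runs past the end of the
    record is simply shorter (or empty)."""
    fields, pos = [], 0
    for w in widths:
        fields.append(record[pos:pos + w])
        pos += w
    return fields

record = "aaabbbbcc"
widths = [3, 4, 5]
fields = split_fixed(record, widths)

# Count every field that received at least one character.
nf_with_data = sum(1 for f in fields if f)

# Fraction of each declared width that was actually filled, summed.
fill = sum(len(f) / w for f, w in zip(fields, widths))

print(fields, nf_with_data, round(fill, 1))
```

Counting a partially filled field as a whole field is one possible rule; the sketch does not claim that this is what gawk does before or after the patch.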

Wolfgang

On 21 May 2017 at 18:29, Andrew J. Schorr <address@hidden> wrote:
Hi,

On Sat, May 20, 2017 at 11:11:18PM +0300, Arnold Robbins wrote:
> gawk can miscalculate NF when using FIELDWIDTHS to parse data:
...
> When run, we get:
>
>       $ gawk -f x.awk x.in
>       3
>       3
>       3
>
> Ooops!
>
> The following patch to gawk's master seems to fix the problem.
>
> Andy - look ok to you?

The patch looks fine to me, although I wonder whether this is really a bug. The
user specified that this is a file of fixed-width records, and we properly
give empty string values for the missing fields. Why was it specified as a
fixed-width record using the FIELDWIDTHS mechanism if that's not actually the
case? I don't really know what NF is supposed to be in such cases. Is that
defined in the docs? Will it break any existing scripts to change this
behavior? Do we need to update the docs to define clearly what happens when the
input record is shorter than expected from the value implied by FIELDWIDTHS?
And then there's the related question of what should happen when the record
is longer than the value implied by FIELDWIDTHS? This also relates to the
suggestion of adding a "*" special character for parsing extra data.
In other words, this issue seems like a can of worms to me.

So, I don't object to applying that patch, but perhaps this general issue of
how to handle fixed-width records that are not the expected size requires more
thought...
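
The open questions above can be sketched with a small Python model. This is a sketch under stated assumptions, not gawk's code: the function names are hypothetical, the "count up to the last non-empty field" rule is just one candidate definition of NF, and the meaning given to a trailing "*" (take whatever is left of the record) is an assumed reading of the suggestion.

```python
def split_fieldwidths(record, spec):
    """Split record by a FIELDWIDTHS-like spec such as "3 4 5".

    Assumption: a trailing "*" takes the rest of the record as one
    more field. Returns (fields, leftover_text_not_assigned)."""
    tokens = spec.split()
    take_rest = bool(tokens) and tokens[-1] == "*"
    widths = [int(t) for t in (tokens[:-1] if take_rest else tokens)]
    fields, pos = [], 0
    for w in widths:
        fields.append(record[pos:pos + w])
        pos += w
    rest = record[pos:]
    if take_rest:
        fields.append(rest)
        rest = ""
    return fields, rest

def nf_with_data(fields):
    """One candidate NF rule: ignore trailing empty fields."""
    n = len(fields)
    while n > 0 and fields[n - 1] == "":
        n -= 1
    return n

# Record shorter than the declared widths: the last field is empty.
short_fields, _ = split_fieldwidths("aaabb", "3 4 5")
# Record longer than the declared widths: the tail is not assigned.
long_fields, dropped = split_fieldwidths("aaabbbbcccccddd", "3 4 5")
# Same long record with the assumed "*" catch-all.
star_fields, _ = split_fieldwidths("aaabbbbcccccddd", "3 4 5 *")

print(short_fields, len(short_fields), nf_with_data(short_fields))
print(long_fields, repr(dropped))
print(star_fields)
```

For the short record the two candidate answers for NF differ (3 declared fields versus 2 with data), which is the ambiguity the documentation would need to settle.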

Regards,
Andy


