Re: [bug-gawk] FIELDWIDTHS can miscount the number of fields


From: Andrew J. Schorr
Subject: Re: [bug-gawk] FIELDWIDTHS can miscount the number of fields
Date: Sun, 21 May 2017 15:46:52 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

On Sun, May 21, 2017 at 10:12:44PM +0300, Arnold Robbins wrote:
> Q1. Given FIELDWIDTHS = "2 3 4" and input data "aabb". How many fields
>    should there be?
>    A. Two, since that's all the data that's there
>    B. Three, with $3 == "", since it's supposed to be all fixed width data
> 
> A1. Gawk currently says three. Arnold leans towards two, since it reflects
>     the actual data and allows code expecting three fields to weed out
>     bad records.

I agree.

> Q2. Given FIELDWIDTHS = "2 3 4" and input data "aab", should $2 have a
>     value?
>     A. No - we're expecting three characters and they weren't all there
>     B. Yes - something was there, make it available
> 
> A2. Gawk currently says "yes".  Arnold isn't sure what's right here.
>     Input is welcome.

I agree with current behavior (B).
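
As a rough illustration of the behavior preferred in A1 and A2, the
following Python sketch models the splitting rule: stop creating fields
once the data runs out, but keep a short final field if some characters
are present. The function name split_fixed is hypothetical, and this is
a model of the semantics under discussion, not gawk's actual code.

```python
def split_fixed(record, widths):
    """Split record into fixed-width fields, stopping when the data runs out.

    Hypothetical model of the semantics discussed in Q1/Q2; not gawk source.
    """
    fields = []
    pos = 0
    for width in widths:
        if pos >= len(record):
            break  # Q1, choice A: no data left, so no empty trailing field
        # Q2, choice B: a partial field is still made available
        fields.append(record[pos:pos + width])
        pos += width
    return fields


if __name__ == "__main__":
    for rec in ("aabb", "aab"):
        fields = split_fixed(rec, [2, 3, 4])
        print(rec, "NF =", len(fields), fields)
```

Under these assumptions both "aabb" and "aab" yield two fields, with the
second field holding "bb" and "b" respectively.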

> Q3. Given FIELDWIDTHS = "2 3 4" and input data "aabbbccccdddd" what should
>     be done with the dddd?
>     A. Nothing - it's extra, ignore it. NF should be set to 3. Code that
>        wants to know if there's something extra can use length() and
>        substr() to get it out of the record.
>     B. Stick it into $4 anyway.
> 
> A3. Arnold and gawk agree on (A).

Since we plan to add support for trailing "*" as in Q4 below, I would
choose the approach that is easiest to implement. I think that's probably A,
since that's what we do now. Those who are interested in trailing data
can use "*".
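
For choice A, the length()/substr() approach mentioned in Q3 amounts to
taking whatever lies beyond the sum of the declared widths. A minimal
Python sketch of that idea, with the hypothetical helper name
trailing_data:

```python
def trailing_data(record, widths):
    """Return the part of record beyond the declared fixed-width fields.

    Hypothetical helper mirroring the length()/substr() idea from Q3.
    """
    total = sum(widths)      # characters consumed by the declared fields
    return record[total:]    # empty string if the record has nothing extra


if __name__ == "__main__":
    print(repr(trailing_data("aabbbccccdddd", [2, 3, 4])))
```

For the Q3 record this returns "dddd", and it returns an empty string
for any record of 9 characters or fewer.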

> Q4. Given the idea that using "*" at the end of FIELDWIDTHS to mean
>     anything else, then with FIELDWIDTHS = "2 3 4 *", and input
>     data "aabbbccccdddd" the dddd would go into $4. The final data
>     would be optional.  Is there any reason not to add this to gawk?
>     It seems to be actually useful and not just theoretically useful.
> 
> A4. Arnold thinks it's right to add it.

Agreed. I presume that NF will be 3 if the record length is 9 and 4 for
10 or longer.
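
The presumed NF behavior for a trailing "*" can be sketched in Python as
follows. This models the proposal only; split_fixed_star is a
hypothetical name, and the handling of the "*" shown here is an
assumption about how the feature would behave, not gawk's implementation.

```python
def split_fixed_star(record, fieldwidths):
    """Split record per a FIELDWIDTHS-style spec that may end in "*".

    Hypothetical model of the Q4 proposal; not gawk source.
    """
    spec = fieldwidths.split()
    has_star = bool(spec) and spec[-1] == "*"
    widths = [int(w) for w in (spec[:-1] if has_star else spec)]

    fields = []
    pos = 0
    for width in widths:
        if pos >= len(record):
            break  # data exhausted: no further fields
        fields.append(record[pos:pos + width])
        pos += width

    # The "*" field is optional: it exists only if data remains.
    if has_star and pos < len(record):
        fields.append(record[pos:])
    return fields


if __name__ == "__main__":
    for rec in ("aabbbcccc", "aabbbccccd", "aabbbccccdddd"):
        fields = split_fixed_star(rec, "2 3 4 *")
        print(len(rec), "NF =", len(fields), fields)
```

With these assumptions a 9-character record gives three fields, and any
record of 10 or more characters gives four, the last holding everything
past position 9.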

Regards,
Andy
