Re: [bug-gawk] gawk4 split() function bug? or feature?


From: Aharon Robbins
Subject: Re: [bug-gawk] gawk4 split() function bug? or feature?
Date: Fri, 22 Mar 2013 10:25:20 +0200
User-agent: Heirloom mailx 12.5 6/20/10

Hi.

> From: Kent <address@hidden>
> Date: Tue, 19 Mar 2013 23:54:44 +0100
> To: address@hidden
> Subject: [bug-gawk] gawk4 split() function bug? or feature?
>
> Hi there,
>
> recently I found something strange about the split() function of gawk,
> I am not sure if it is a bug, it would be good if you guys could
> explain a bit. Thanks in advance.

It's a bit of a dark corner.

> my gawk version
> kent$  gawk --version
> GNU Awk 4.0.2
>
> I know that the 3rd parameter of the split() function is a regex, but
> take a look at these examples:
>
> kent$  echo "foo.bar.baz" | awk '{split($0,a,"."); print "length of a:" length(a);for (x in a) print a[x]}'
> length of a:3
> foo
> bar
> baz

The third argument can also be a string.  It is then treated like
the value of FS, where if the value is a single character (even if that
character is a regex metacharacter) it acts as the separator. Once it
is longer than a single character, it is treated as a dynamic regexp.
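
As a rough illustration of the rule above, here is a small sketch. It
assumes a POSIX-style awk on the PATH; the helper name count_fields is
made up for this example and is not part of awk.

```shell
# count_fields (hypothetical helper): print how many pieces split()
# produces when the separator is passed in as a string value.
count_fields() {
    printf '%s\n' "$1" | awk -v sep="$2" '{ print split($0, a, sep) }'
}

# One character: the dot is taken literally, like FS = "."
count_fields "foo.bar.baz" "."     # prints 3

# Two characters: treated as a dynamic regexp, so each "." matches any
# character and every pair of characters becomes a separator.
count_fields "foo.bar.baz" ".."    # prints 6 (five empty pieces, then "z")

# A single space is the one special case, again just like FS: it splits
# on runs of whitespace.
count_fields "a  b   c" " "        # prints 3
```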

> split() treats "." as a literal dot, same as /[.]/ or /\./
> but if I do:
> kent$  echo "foo.bar.baz" | awk '{split($0,a,/./); print "length of a:" length(a);for (x in a) print a[x]}'
> length of a:12
> (here we have 12 empty lines)

Here, /./ is a regexp constant, so gawk knows unequivocally that it should
treat the period as a metacharacter.  Other awks work this way also, not
just gawk.
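
To see the string and regexp-constant forms side by side, something
like the following sketch may help. The function name
compare_separators is invented for this example, and it uses the return
value of split() rather than length(a) so that it does not depend on
array length() support.

```shell
# compare_separators (hypothetical helper): split the same input four
# ways and print the number of pieces for each.
compare_separators() {
    printf 'foo.bar.baz\n' | awk '{
        n_str   = split($0, a, ".")     # one-character string: literal dot
        n_re    = split($0, b, /./)     # regexp constant: any character
        n_class = split($0, c, /[.]/)   # bracket expression: literal dot
        n_esc   = split($0, d, /\./)    # escaped dot: literal dot
        print n_str, n_re, n_class, n_esc
    }'
}

compare_separators    # prints: 3 12 3 3
```

With /./ each of the 11 input characters is a separator, which leaves
12 empty pieces, matching the output in the original report.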

It can be confusing, I admit. The language has more dark corners like this
than one would like, but that's the way it is. :-)

Hope this helps,

Arnold


