
From: arnold
Subject: Re: feature: expose POSITION of parsed columns (fields) as a variable/function?
Date: Fri, 14 May 2021 05:22:07 -0600
User-agent: Heirloom mailx 12.5 7/5/10


Gawk does not record where fields start and end within a record.
Keeping that information and publishing it in an array would be very
expensive in processing time, since the array would have to be updated
on every input record.

In addition, gawk parses the input record lazily. If a record has 872 fields
and a program only asks for $5, it only parses the record up to $5.
Requiring that an FPOS array (for example) be available would require
fully parsing the record every time.

You can use the split() function with a fourth array argument to get the
separator strings and then compute such an array yourself in a
user-level function without much difficulty.

Vla D <dubovo@gmail.com> wrote:

> OffTopic1: if awk would've allowed to treat strings as
> pointers-to-first-char - we could've just calculated `1 + $3 - $0` (one
> plus mem address of 8th char minus mem address of 1st char) = 8, but this
> is from non-awk universe

Quite. $3 already has a well defined meaning. And I'm not about to
add unary & to the language.

> OffTopic2: if split($0,flds,FS,seps) could've been made "lazy", e.g. to
> do the actual parsing only up to the field at the moment when we use
> the field (flds[3]) - this might've added enough performance to the
> workarounds of original issue, BUT this sounds waaay more complex to
> implement than just exposing the desired value as a function....

There's no real way to make this lazy; adding another argument that
says "only go to field N" would further complicate code and documentation
and isn't worth it, IMHO.

Of course, as is always the case, the code is Free Software, and you
are welcome to modify a private copy for yourself.


