From: | Manuel Collado |
Subject: | Re: CSV extension status |
Date: | Mon, 17 May 2021 23:44:56 +0200 |
User-agent: | Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0 |
On 17/05/2021 at 16:00, Andrew J. Schorr wrote:
> Hi,
>
> On Mon, May 17, 2021 at 03:44:10PM +0200, Manuel Collado wrote:
>> The gawk-csv extension pushes the gawk API for input parsers to its limits, and uncovers some nasty limitations. An input parser can deliver a record composed of fields, but there is no way to control further reparsing of the record after assigning new values to $0..$NF.
>
> Can you please elaborate on what you have in mind with regards to "further reparsing of the record"? Do you have an example that demonstrates the problem?
A record is parsed when it is read from an input file, and again after an assignment such as $0 = "new value". The API allows a custom input parser to do the first, but not the second.
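As a minimal sketch of the second case, using plain awk with the default whitespace splitting (the helper name reparse_demo is just an illustration, not part of any extension): assigning to $0 makes awk split the record again with its own built-in rules, which is the step a custom input parser never sees.

```shell
# Hypothetical helper: print NF as read, then again after $0 is overwritten.
reparse_demo() {
    printf 'a b c\n' | awk '{ print NF; $0 = "x y"; print NF }'
}

reparse_demo   # prints 3, then 2
```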
For instance, a standard way of prepending a field to the current record would be:
    $0 = "new field" OFS $0

For CSV fields and records, this construction only works if FPAT and OFS have the appropriate values. But the API doesn't allow the extension to silently assign values to the predefined variables; this must be done explicitly in the user's gawk code.
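A rough illustration of the OFS half of the problem, simplified to unquoted comma-separated data so that a plain FS="," is enough (real CSV with quoted fields would need gawk's FPAT instead, and the helper names are hypothetical). With the default OFS the prepended text is glued onto the first field; it only becomes a field of its own once OFS is set by hand.

```shell
# Hypothetical helper: prepend a field while OFS keeps its default (a single space).
prepend_default_ofs() {
    printf 'a,b\n' | awk -F, '{ $0 = "new field" OFS $0; print NF ": " $1 }'
}

# Same construction, but with OFS set explicitly to match the comma separator.
prepend_comma_ofs() {
    printf 'a,b\n' | awk -F, -v OFS=, '{ $0 = "new field" OFS $0; print NF ": " $1 }'
}

prepend_default_ofs   # prints "2: new field a"
prepend_comma_ofs     # prints "3: new field"
```

The point is that the second variant depends on a setting made in the user's program, not by the extension.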
And things are even worse if the record syntax cannot be parsed with the supported FS/FPAT/FIELDWIDTHS modes.
A naive approach would be to let the API offer a hook that allows a custom input parser to fully override the internal gawk record parser. But this possibility requires careful consideration.
I hope this clarifies things. I'm ready to explain my goals further, if you like.
Regards.
--
Manuel Collado - http://mcollado.z15.es