Re: [bug-gawk] Tentative CSV extension - please advise


From: Andrew J. Schorr
Subject: Re: [bug-gawk] Tentative CSV extension - please advise
Date: Mon, 14 Mar 2016 19:56:57 -0400
User-agent: Mutt/1.5.23 (2014-03-12)

On Mon, Mar 14, 2016 at 11:14:20PM +0100, Manuel Collado wrote:
> I'm considering developing a CSV read/write extension. It is currently at a
> very preliminary stage: just reading specs and API docs, collecting CSV
> libraries for a variety of languages, etc.

Cool.

> In order to organize things properly, please confirm, correct, or
> comment on the following points:
> 
> 1.- The extension should provide CSV field values as $1-$NF

That sounds right.

> 2.- The current extension API only allows providing a string value for $0,
> but not individual field values. $0 is split into fields by the gawk core
> according to the current FS.

I took a look at the awk_input_buf_t API, and you are right. And it has to be
this way, since the file could also be read using getline into a variable, in
which case the notion of $1-$NF does not apply.

> 3.- A possible solution would be to forge a $0 record by joining the csv
> fields with a customizable ad-hoc field separator, and temporarily set FS to
> that separator value.

Ugh. If there were a string that is guaranteed never to appear in a CSV
file, then we could use that for FS. If such a string does not exist, then
the only clean way to solve this problem seems to be to modify core gawk
to add a CSV field-splitting capability.
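
To make the concern concrete, here is a minimal sketch in Python (not gawk
extension code) of the forged-record idea from point 3. The separator used,
ASCII Unit Separator (0x1F), and the helper names forge_record and
split_record are assumptions chosen for illustration; nothing in this thread
fixes them. It shows that a quoted field with an embedded comma survives the
round trip, while a field that happens to contain the separator is split in
two.

```python
import csv
import io

# Hypothetical separator: ASCII Unit Separator (0x1F). This is an assumption
# for illustration; nothing guarantees it never appears in CSV data.
SEP = "\x1f"


def forge_record(csv_line, sep=SEP):
    """Parse one CSV line and join its fields with sep, mimicking a forged $0."""
    # next(..., []) yields no fields for an empty input line.
    fields = next(csv.reader(io.StringIO(csv_line)), [])
    return sep.join(fields)


def split_record(record, sep=SEP):
    """Split the forged record roughly the way awk would with FS set to sep."""
    return record.split(sep)


# A quoted field with an embedded comma survives the round trip.
ok = split_record(forge_record('a,"b,c",d'))

# A field that itself contains the separator is wrongly split in two.
bad = split_record(forge_record('a,"x\x1fy",d'))

print(ok)   # three fields
print(bad)  # four fields, although the CSV line had three
```

The second case is exactly the failure mode described above: unless the
separator can be guaranteed absent from the data, the field count seen
through FS can differ from the field count in the CSV file.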

> 6.- Help with initially setting up a gawkextlib extension development
> directory would be much appreciated.

If you clone the gawkextlib tree, it contains a script named
"make_extension_directory.sh" at the top level. It sets up a skeleton
extension and tells you what to do next. Here's the help message:

   bash-4.2$ ./make_extension_directory.sh --help
   ./make_extension_directory.sh: illegal option -- -

   Usage: make_extension_directory.sh [-g <path to gawk>] [-l <path to gawkextlib>] <new extension directory name> <author name> <author email address>

           Configures a new directory for adding an extension.  This installs
           the boilerplate stuff so you can focus on writing code, documentation,
           and test cases.

           Options:
                   -g      Specify path to your gawk installation, in case
                           it's in a nonstandard place.
                   -l      Specify path to your gawkextlib installation, in case
                           it's in a nonstandard place.

So that part should be easy. The problem is the CSV field parsing. I hadn't
considered that issue.

Regards,
Andy


