Re: CSV extension status

From: Manuel Collado
Subject: Re: CSV extension status
Date: Tue, 18 May 2021 23:38:42 +0200
User-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0

On 18/05/2021 at 17:33, Andrew J. Schorr wrote:
I downloaded the tgz and zip files, but they seem to be missing the main code,
or I am losing my mind:

Your mind is sane. After I corrected some errata in the docs, the tarballs were recreated the wrong way!

OK, I took a look at the code, and I get the idea. My first thought is
to wonder whether you should use a namespace to scope the proliferation
of variables and functions. There are a ton of hidden variables and
functions that should probably be isolated.

Well, I'm not a fan of namespaces. What would really help is truly modular gawk code, with automatic global/local scopes. A common name prefix reduces the risk of name clashes, as in gawk-xml.
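
To illustrate the prefix convention, here is a minimal sketch in plain awk. The `mylib_` prefix and the `mylib_count` function are hypothetical names chosen for this example only; they are not taken from gawk-csv or gawk-xml.

```shell
result=$(printf 'a,b,c\n' | awk '
# Hypothetical library code: every global name carries the "mylib_" prefix.
function mylib_count(line, sep,    parts) {   # "parts" is local, by awk convention
    mylib_calls++                             # library global, prefixed
    return split(line, parts, sep)
}
# User code: free to use short names such as "n" without clashing.
{ n = mylib_count($0, ","); print n, mylib_calls }
')
printf '%s\n' "$result"
```

The prefix is only a convention: nothing stops user code from touching `mylib_calls`. The `@namespace` directive available since gawk 5.0 is the language-level alternative discussed above.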

That being said, this solution strikes me as much more complicated
and likely to be much slower than an input parser implemented in C.

Ok. I'll benchmark gawk-csv vs. CSVMODE.

For those who want simple, read-only access to CSV documents, my gut instinct
is that an input parser library would be a better and more robust solution.
In particular, the splitting and reconstruction of the record with OFS
seems a bit slow and fragile to me.

If your code never rewrites the data, this reconstruction never takes place. And if it does, the reconstruction is done in the gawk core, not in the extension. Do you think the gawk core is slow and fragile? ;-)
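
A small sketch of this core behaviour, using ordinary whitespace-separated input rather than CSV, so it shows standard awk semantics and not anything specific to gawk-csv or CSVMODE. The variable names are illustrative.

```shell
# Reading a field leaves $0 untouched, including its original spacing.
untouched=$(printf 'a  b   c\n' | awk 'BEGIN { OFS = "," } { x = $2; print }')

# Assigning to a field makes awk rebuild $0 from the fields, joined with OFS.
rebuilt=$(printf 'a  b   c\n' | awk 'BEGIN { OFS = "," } { $2 = "B"; print }')

printf '%s\n%s\n' "$untouched" "$rebuilt"
```

Only the second command pays for the rebuild, and it happens inside awk itself.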

I really just want to be able to
say gawk -lcsv and not have to worry about configuring all of the
CSV* variables correctly.

The sample code in the previous message doesn't configure any control variable. Both gawk-csv and CSVMODE try to make CSV processing as awk-ish as possible.

Please post sample cases that you find challenging, and I will try to give simple solutions.


Manuel Collado - http://mcollado.z15.es
