RE: Parse CVS in awk
From: Carl Friedberg
Subject: RE: Parse CVS in awk
Date: Fri, 10 Apr 2020 13:54:28 +0000
Manuel,
Thank you for that link; you have expanded my horizons! And Tim Berners-Lee is
in on this.
Best wishes,
Carl Friedberg
President, Comet & Company
165 William St New York NY
Office (212) 233-5470
Cell (917) 861-7819
-----Original Message-----
From: bug-gawk <bug-gawk-bounces+friedberg=address@hidden> On Behalf Of Peter
Brooks
Sent: Friday, April 10, 2020 12:53 AM
To: Manuel Collado <address@hidden>
Cc: bug-gawk <address@hidden>
Subject: Re: Parse CVS in awk
You might find this a useful tool:
https://colin.maudry.com/csvtool-manual-page/
Sent from my iPad
> On 9 Apr 2020, at 18:53, Manuel Collado <address@hidden> wrote:
>
> On 09/04/2020 at 17:00, Manuel Collado wrote:
>>> On 09/04/2020 at 4:51, Peng Yu wrote:
>>> I'm wondering if the solution mentioned here is robust against all
>>> CSV format variations.
>>>
>>> https://www.gnu.org/software/gawk/manual/gawk.html#Splitting-By-Content
>
> This manual says:
>
> <quote>
> NOTE: Some programs export CSV data that contains embedded newlines between
> the double quotes. gawk provides no way to deal with this. Even though a
> formal specification for CSV data exists, there isn’t much more to be done;
> the FPAT mechanism provides an elegant solution for the majority of cases,
> and the gawk developers are satisfied with that.
> <endquote>
>
> Well, there is a trick that can handle fields with embedded newlines. The
> idea is to join lines until the number of quotes is even, and to amend
> NR and FNR if necessary:
>
> # Process CSV input records with embedded newlines
> {
>     # Collect multi-line data, if it is the case
>     CSVRECORD = $0
>     while (gsub("\"", "\"", CSVRECORD) % 2 == 1 && (_csv_multi = getline _csv_) > 0) {
>         CSVRECORD = CSVRECORD "\n" _csv_
>         NR--
>         FNR--
>     }
>     if (_csv_multi) {
>         $0 = CSVRECORD
>     }
> }
>
> HTH.
> --
> Manuel Collado - http://mcollado.z15.es
>
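To see the quote-parity trick from the quoted message in action, here is a minimal sketch that wraps the same loop in a shell function and prints one line per logical record. The function name `join_csv_records`, the variables `record`, `more` and `lines`, and the `<NL>` marker are illustrative choices, not part of the original post. It relies only on `gsub` returning the number of substitutions and on `getline var` advancing NR and FNR, so it should run in any POSIX awk, though it has only been reasoned through here, not tested across implementations.

```shell
# join_csv_records: read CSV on stdin and print, for each logical record,
# "<NR>: <physical line count>: <record>", with embedded newlines shown as <NL>.
# Hypothetical helper for illustration only.
join_csv_records() {
    awk '
    {
        record = $0
        lines = 1
        # An odd number of double quotes means a quoted field is still open,
        # so keep appending physical lines until the count becomes even.
        while (gsub(/"/, "\"", record) % 2 == 1 && (getline more) > 0) {
            record = record "\n" more
            NR--; FNR--        # undo the increment done by getline
            lines++
        }
        gsub(/\n/, "<NL>", record)   # make embedded newlines visible
        print NR ": " lines ": " record
    }'
}
```

For example, `printf 'id,comment\n1,"hello\nworld"\n2,plain\n' | join_csv_records` should report three records, the second spanning two physical lines. A doubled quote (`""`) inside a quoted field adds two to the count, so it does not disturb the parity test. In gawk, the joined text could then be assigned to `$0` with FPAT set as in the manual section linked above, so that the fields are split by content.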