bug-coreutils

bug#6277: cut: Please add CSV parsing


From: Bob Proulx
Subject: bug#6277: cut: Please add CSV parsing
Date: Thu, 27 May 2010 09:27:00 -0600
User-agent: Mutt/1.5.18 (2008-05-17)

retitle 6277 cut: Please add CSV parsing
tags 6277 + wishlist wontfix
thanks

sandy bas wrote:
> Comma delimited files often have fields of the form "big,black,bear"
> where the commas within the quotes are not delimiters. A useful
> option in cut would be to ignore the commas (delimiters) within the
> quotation marks.
> 
> I would be glad to put it in if you would like the option.

Parsing CSV files is deceptively more complicated than it looks.
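
To illustrate with the "big,black,bear" case from the request, here is a
minimal Python sketch.  The sample line is made up for illustration.  It
compares splitting on every comma (roughly what 'cut -d,' does) with a
quote-aware parse using Python's standard csv module.

```python
import csv
import io

# Hypothetical sample line; the middle field contains quoted commas.
line = 'grizzly,"big,black,bear",42\n'

# Naive approach: treat every comma as a delimiter.
naive_fields = line.rstrip("\n").split(",")

# Quote-aware approach: let the csv module handle the quoting rules.
parsed_fields = next(csv.reader(io.StringIO(line)))

print(len(naive_fields), naive_fields)
print(len(parsed_fields), parsed_fields)
```

The naive split yields five pieces, while the quote-aware parse yields the
three intended fields.  Escaped quotes and newlines embedded in quoted
fields add further cases on top of this one.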

Using the Perl Text::CSV module as a guide shows that the result would
add several thousand lines of code.  This would fall under the
category of creeping featurism and code bloat because it would
significantly enlarge the code base of the 'cut' program well beyond
its traditional role as a simple cut-by-field program.

And if CSV parsing is allowed in, then by the same reasoning wouldn't
parsing of other file formats be allowed in as well?  Plus the coreutils are the
core utilities that belong on every machine in the universe.  Does my
toaster need this capability?  Large items like this really should go
into a differently named program.  It isn't just the use of the
program on a fully loaded desktop but also the use of the program
across the entire universe of machines.

I am sorry but full CSV parsing really doesn't belong in cut.

I suggest that you use Perl, Python or Ruby for CSV processing.  They
include full libraries for dealing with the many varied details of CSV
handling.

Something like the following is a simple example Perl script to print
only the second field of a CSV file.

  #!/usr/bin/env perl
  use strict;
  use warnings;
  use Text::CSV;
  my $csv = Text::CSV->new;
  foreach my $filename (@ARGV) {
      # Use a lexical filehandle and report open failures as such.
      open(my $fh, '<', $filename) or die "Error opening $filename: $!\n";
      while (defined(my $line = <$fh>)) {
          if (! $csv->parse($line)) {
              die("Error parsing: " . $csv->error_input . "\n");
          }
          print(($csv->fields())[1], "\n");  # print second field
      }
      close($fh);
  }
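
For comparison, a rough Python equivalent might look like the following.
This is only a sketch using the standard csv module.  The function name
second_fields is made up for illustration, and rows with fewer than two
fields are silently skipped, which is a choice of this sketch and not
something the Perl script above does.

```python
import csv

def second_fields(filename):
    """Yield the second field of each row in the named CSV file."""
    # newline="" lets the csv module handle line endings itself,
    # including newlines embedded inside quoted fields.
    with open(filename, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) > 1:  # skip rows with no second field
                yield row[1]
```

Calling it as "for field in second_fields(path): print(field)" prints one
second field per row.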

Bob