bug-coreutils

Re: [ADD TO FAQ] : cut : using space as input delimiter


From: Bob Proulx
Subject: Re: [ADD TO FAQ] : cut : using space as input delimiter
Date: Tue, 11 Mar 2008 14:22:55 -0600
User-agent: Mutt/1.5.13 (2006-08-11)

seb_kramm wrote:
> I searched for about an hour on how to tell 'cut' to use space as input 
> delimiter, and finally found out on my own! Below is the question I was 
> about to post, and the solution. I suggest adding this "trick" to the FAQ, 
> or even to the manual.

I will add something to the FAQ since this has come up a few times.

> I have a generated numeric data file, to be plotted using gnuplot. The 
> fields are separated with spaces. Before plotting, I need to do some basic 
> line/text processing, using 'cut' and 'join'.

'cut' isn't the best tool for cutting whitespace-separated fields.
'awk' is better.

> # data
> 1  1.1  1.3  1.4
> 2  1.2  1.5  1.6
> ...
> 
> Unfortunately, I can't manage to use the space character as input separator 
> for 'cut'. I read the coreutils FAQ, the manual (5.3.0), p.43, but I 
> understand (and tried) that this is not possible.

The problem is that the fields are separated by TWO spaces, not one.
'cut' treats every single delimiter character as a field boundary, so
in the above there are SEVEN fields, not four.  Field 2, which you
tried to cut, is an empty field.  You would want field 3 instead.
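
As a quick check of the field count, here is a small sketch assuming a
POSIX shell with 'cut' and 'awk' available.  The variable names are
only for illustration, and the sample line is taken from the data
quoted above.

```shell
# One line of the sample data, with two spaces between the values.
line='2  1.2  1.5  1.6'

# Count fields when a single space is the separator.  The bracket
# expression '[ ]' makes awk split on each individual space instead of
# using its default whitespace handling.
nfields=$(printf '%s\n' "$line" | awk -F'[ ]' '{ print NF }')

# With a single-space delimiter, field 2 is empty and field 3 holds 1.2.
field2=$(printf '%s\n' "$line" | cut -d' ' -f2)
field3=$(printf '%s\n' "$line" | cut -d' ' -f3)

printf 'fields=%s field2=[%s] field3=[%s]\n' "$nfields" "$field2" "$field3"
# prints: fields=7 field2=[] field3=[1.2]
```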

> Quote the whole 'd' option:

Yes.  Or quote just the delimiter.  Either way is exactly the same to
the shell that is parsing the strings.  Most GNU and Unix folks would
probably quote just the delimiter, because an unquoted space is
whitespace and the shell would split the arguments there.  But I see
that you are using MS, and command.com there is quite different.  I
only know the GNU and Unix way.

> cut "--delimiter= " -f 2 file_in > file_out
> or
> cut "-d " -f 2 file_in > file_out

But field 2 is empty.  It would need to be field three given your data.

  cut -d" " -f3 file_in
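
If you would rather keep counting fields the natural way, one option
(not something from the original question, just a sketch) is to
squeeze each run of spaces down to a single space with 'tr -s' before
handing the line to 'cut'.  This assumes the lines have no leading
spaces; a leading space would still produce an empty first field.

```shell
# Sample line with two spaces between the values.
line='2  1.2  1.5  1.6'

# tr -s ' ' collapses each run of spaces into one space, so cut then
# sees four fields and field 2 is the second value.
second=$(printf '%s\n' "$line" | tr -s ' ' | cut -d' ' -f2)

printf '%s\n' "$second"
# prints: 1.2
```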

> Another solution is using 'unexpand' to replace spaces by tabs

Yes.  'cut' was really designed to work on TAB-separated fields.
Using 'unexpand' to get to TAB-separated fields is in the spirit of
the way cut is designed to work.  If 'cut' were designed again today
it would probably use a comma by default, because .csv files are now
more common than TAB-separated fields.  I would be inclined to convert
the spaces to commas, because then the delimiters are visible and
things are more obvious.

  echo "2  1.2  1.5  1.6" | tr " " ","
  2,,1.2,,1.5,,1.6

There it is clear that field 2 is empty.

The better answer to all of this is to use 'awk' because it handles
generic whitespace as a field separator.  Try this:

  awk '{print$2}' file_in
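
To illustrate what "generic whitespace" means here, a small sketch
with made-up input: the second line starts with a space and mixes
single spaces, a TAB, and a run of spaces, and awk's default field
splitting still finds the second value on both lines.

```shell
# Two sample lines: the first uses two spaces between values, the
# second has a leading space, a TAB, and a run of three spaces.
result=$(printf '1  1.1  1.3  1.4\n 2 1.2\t1.5   1.6\n' | awk '{ print $2 }')

printf '%s\n' "$result"
# prints:
# 1.1
# 1.2
```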

Bob



