bug-datamash

Re: tranpose bug with comma field-separator


From: Erik Auerswald
Subject: Re: tranpose bug with comma field-separator
Date: Fri, 4 Jun 2021 14:35:10 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

Hi,

Charles Read wrote on Fri, 13 Mar 2020 23:36:58 -0000:
> Steps to reproduce:
> 
>      1. Put this data in the file /tmp/z0.txt:
> 
> A,B,C,D,E,F,G,H,I,J
> 0,1,2,3,4,5,6,7,8,9
> 
>      2. Then run:
> 
> $ datamash --field-separator=, transpose < /tmp/z0.txt 
> 
> A,0
> B,1
> C,2
> D,3
> E,4
> F,5
> G,6
> H,7
> I,8
> ,9
> 
> Notice the last transposed row omits the column-name "J".

I can reproduce this behavior on GNU/Linux with several versions of GNU
datamash, but only when using CRLF as line ending (as on Windows), not
with LF as line ending (as on GNU/Linux or macOS):

$ printf -- 'A,B\n0,1\n' | datamash --field-separator=, transpose
A,0
B,1
$ printf -- 'A,B\r\n0,1\r\n' | datamash --field-separator=, transpose
A,0
,1
$ printf -- 'A,B\r\n0,1\r\n' | datamash --field-separator=, transpose | cat -A
A,0$
B^M,1^M$

GNU datamash works with POSIX text files, as is usual on GNU/Linux, i.e.,
each line is terminated by an LF character ('\n').  GNU datamash does not
treat the CR character ('\r') specially; it is just part of the field
contents.  When displayed in a terminal, CR moves the cursor to the first
column, so the output that follows overwrites what was printed earlier on
that line.  That is why the "B" seems to be missing in the second example.
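
To check whether a file has this problem before handing it to datamash,
something like the following might do.  This is only a sketch: has_crlf
is a made-up helper name, not part of datamash, and it assumes a POSIX
shell and grep.

```shell
# Hypothetical helper (not part of datamash): succeed if at least one
# line on stdin ends with a CR byte, i.e. the input likely uses CRLF.
has_crlf() {
    cr=$(printf '\r')      # a literal CR; works without relying on \r in grep
    grep -q "${cr}\$"      # match a CR directly before the end of a line
}

printf -- 'A,B\r\n0,1\r\n' | has_crlf && echo 'CRLF line endings found'
```

With the CRLF input from above this prints "CRLF line endings found";
with LF-only input it prints nothing.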

You can turn CRLF line endings into LF using a simple sed invocation (this
works with GNU sed, at least):

$ printf -- 'A,B\r\n0,1\r\n' | sed 's/\r$//' | datamash --field-separator=, transpose
A,0
B,1
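
If the sed at hand does not understand '\r', tr may be an alternative.
This is a sketch under the assumption that the data contains no CR
characters that need to be kept, since tr removes every CR, not just
those at the end of a line.  strip_cr is a made-up helper name.

```shell
# Hypothetical helper (not part of datamash): delete every CR byte from
# stdin.  Unlike the sed command, this also removes CRs inside fields.
strip_cr() {
    tr -d '\r'
}

# Show the bytes that remain after stripping; no \r should be listed.
printf -- 'A,B\r\n0,1\r\n' | strip_cr | od -c
```

The output of strip_cr can then be piped into datamash in place of the
sed step.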

I do not think that this is a bug in GNU datamash.

Thanks,
Erik
-- 
https://www.unix-ag.uni-kl.de/~auerswal/


