bug-coreutils

bug#8040: join 5.97 bug


From: Eric Blake
Subject: bug#8040: join 5.97 bug
Date: Mon, 14 Feb 2011 17:18:43 -0700
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101209 Fedora/3.1.7-0.35.b3pre.fc14 Lightning/1.0b3pre Mnenhy/0.8.3 Thunderbird/3.1.7

On 02/14/2011 04:45 PM, Batson, Brannon wrote:
> 
> File a:
> 10 A
> 1  B
> 
> File b:
> 1
> 
> $ join b a
> <nada>
> 
> $ join -v 1 d c
> 1

You didn't provide a file d or c to compare against.

> 
> files a & b are both sorted lexicographically  (according to 'sort', anyway). 
> The problem is that the join lexicographic '<' operator disagrees with sort's.

Thanks for the report.  I can't help but wonder if you've stumbled into
this:

http://www.gnu.org/software/coreutils/faq/#join-requires-sorted-input-files

At any rate, the only bug here is in your input files.

> 
> Sorry if this bug has been found like a thousand times before, couldn't find 
> it via 30s of googling.

Coreutils 5.97 is OLD.  The latest stable release is 8.10, and it has
improved diagnostics for helping you discover sorting problems with join:

join --help reminds you that:

Important: FILE1 and FILE2 must be sorted on the join fields.
E.g., use ` sort -k 1b,1 ' if `join' has no options,
or use ` join -t '' ' if `sort' has no options.
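
A minimal sketch of that advice applied to the two files from the report. The
scratch directory and the a.sorted/b.sorted names are only illustrative, and
LC_ALL=C is an assumption chosen so that sort and join agree on one collation
regardless of the caller's locale:

```shell
# Recreate the reporter's files in a scratch directory (illustrative only).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '10 A\n1  B\n' > a
printf '1\n' > b

# Sort each file on the join field alone, ignoring leading blanks.
LC_ALL=C sort -k 1b,1 a > a.sorted
LC_ALL=C sort -k 1b,1 b > b.sorted

# Join under the same locale that was used for sorting.
result=$(LC_ALL=C join b.sorted a.sorted)
echo "$result"
```

With these inputs the join pairs the key 1 and prints "1 B"; the 10 line has no
partner in b and is dropped.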

And trying your example with LC_ALL=en_US.UTF-8 gives:

$ join b a
join: file 2 is not in sorted order

Sure enough, sort --debug finds the culprit (a was not sorted
according to -k 1b,1):

$ sort --debug a
sort: using `en_US.UTF-8' sorting rules
10 A
____
1  B
____
$ sort --debug -k 1b,1 a
sort: using `en_US.UTF-8' sorting rules
1  B
_
____
10 A
__
____
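
If you only want a yes/no answer rather than the annotated keys, sort -c checks
the order without producing sorted output. The following is a sketch under the
same assumptions as the example file above; the scratch directory, the
a.sorted name, and the use of LC_ALL=C are illustrative choices, not part of
the original report:

```shell
# Recreate file a from the report in a scratch directory (illustrative only).
tmpdir=$(mktemp -d)
printf '10 A\n1  B\n' > "$tmpdir/a"

# sort -c exits non-zero as soon as it finds a line out of order for the given key.
if LC_ALL=C sort -c -k 1b,1 "$tmpdir/a" 2>/dev/null; then
    before=sorted
else
    before=unsorted
fi

# Re-sort on the join field and check again with the same key and locale.
LC_ALL=C sort -k 1b,1 "$tmpdir/a" > "$tmpdir/a.sorted"
if LC_ALL=C sort -c -k 1b,1 "$tmpdir/a.sorted"; then
    after=sorted
else
    after=unsorted
fi

echo "a: $before, a.sorted: $after"
```

This reports a as unsorted and a.sorted as sorted on the key that join uses.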


-- 
Eric Blake   address@hidden    +1-801-349-2682
Libvirt virtualization library http://libvirt.org


