bug#9780: sort -u throws out non-duplicates

From: Bernhard Rosenkraenzer
Subject: bug#9780: sort -u throws out non-duplicates
Date: Tue, 18 Oct 2011 09:48:00 +0100
On Mon, 17 Oct 2011 20:22:52 -0600, Eric Blake wrote:
On 10/17/2011 06:59 PM, Bernhard Rosenkraenzer wrote:
> Thanks for the report.  Unfortunately, you did not provide enough
> information to reproduce this - for example, what platform are you
> running on?

Fairly current Linux -- kernel 3.1-rc9, eglibc 2.14.1

> Can you narrow it down to a single file of say 5 or so
> lines?  Can you reproduce the problem with shorter input lines?

address@hidden ~]$ echo 'libcore/luni/src/main/java/java/security/cert/X509CRLSelector.java libcore/luni/src/main/java/java/security/cert/X509CertSelector.java libcore/luni/src/main/java/java/security/cert/X509Certificate.java libcore/luni/src/main/java/javax/security/cert/X509Certificate.java' |tr ' ' '\n' |sort -u --debug
sort: using `en_US' sorting rules

It starts working correctly if any one of the entries is removed, yet none of them should match another as a duplicate as far as I can see.

> My guess, although I need more info to confirm it, is that this is
> not a bug, but rather that java-source-list contains some lines that
> differ in case and/or punctuation but happen to collate identically.
> If so, then sort -u is picking the lower-case version as the unique
> line, at which point your grep for the case-sensitive X509Certificate
> is obviously failing.

FWIW, changing everything to lower case doesn't change anything:
address@hidden ~]$ echo 'libcore/luni/src/main/java/java/security/cert/x509crlselector.java libcore/luni/src/main/java/java/security/cert/x509certselector.java libcore/luni/src/main/java/java/security/cert/x509certificate.java libcore/luni/src/main/java/javax/security/cert/x509certificate.java' |tr ' ' '\n' |sort -u --debug
sort: using `en_US' sorting rules
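
One way to check whether locale collation is involved at all would be to run the same four lines through sort -u twice, once with the C locale forced and once with whatever locale is active, and compare the line counts. This is only a sketch: the variable names (paths, c_count, loc_count) are made up for illustration, and the second count depends on the locale data installed on the machine, so no particular value is claimed for it.

```shell
# The four paths from the report above, one per line.
paths='libcore/luni/src/main/java/java/security/cert/X509CRLSelector.java
libcore/luni/src/main/java/java/security/cert/X509CertSelector.java
libcore/luni/src/main/java/java/security/cert/X509Certificate.java
libcore/luni/src/main/java/javax/security/cert/X509Certificate.java'

# Byte-wise comparison: the C locale ignores en_US collation rules.
c_count=$(printf '%s\n' "$paths" | LC_ALL=C sort -u | wc -l)

# Same input under the locale currently in effect (en_US in the report).
loc_count=$(printf '%s\n' "$paths" | sort -u | wc -l)

echo "C locale: $c_count unique lines; current locale: $loc_count unique lines"
```

If the two counts differ, the lines are being dropped by the locale-aware comparison rather than by anything in the input itself.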


