--- Begin Message ---
Subject: sort fails on some UTF-8 input
Date: Wed, 2 Jun 2010 05:51:25 +0100
User-agent: Mutt/1.5.20 (2009-06-14)

I'm using coreutils 8.5 on Solaris 10.
GNU 'sort' fails to sort some input, while Solaris 'sort' handles it
correctly:
willow% /opt/ts/gnu/bin/sort sort_test.txt
/opt/ts/gnu/bin/sort: string comparison failed: Illegal byte sequence
/opt/ts/gnu/bin/sort: Set LC_ALL='C' to work around the problem.
/opt/ts/gnu/bin/sort: The strings compared were
`\360\222\203\276\360\222\205\226' and
`\360\222\200\255\360\222\213\253\360\222\213\253\360\222\200\255'.
willow% /usr/bin/sort sort_test.txt
𒃾𒅖
𒀭𒋫𒋫𒀭
willow%
I've attached the example file sort_test.txt.
- river.
sort_test.txt
Description: Text document
--- End Message ---
--- Begin Message ---
Subject: Re: bug#6327: sort fails on some UTF-8 input
Date: Mon, 08 Aug 2011 08:27:56 +0200

River Tarnell wrote:
> I'm using coreutils 8.5 on Solaris 10.
>
> GNU 'sort' fails to sort some input, while Solaris 'sort' handles it
> correctly:
>
> willow% /opt/ts/gnu/bin/sort sort_test.txt
> /opt/ts/gnu/bin/sort: string comparison failed: Illegal byte sequence
> /opt/ts/gnu/bin/sort: Set LC_ALL='C' to work around the problem.
> /opt/ts/gnu/bin/sort: The strings compared were
> `\360\222\203\276\360\222\205\226' and
> `\360\222\200\255\360\222\213\253\360\222\213\253\360\222\200\255'.
> willow% /usr/bin/sort sort_test.txt
> 𒃾𒅖
> 𒀭𒋫𒋫𒀭
> willow%
>
> I've attached the example file sort_test.txt.
Thanks for the report.
Since this appears to be caused not by GNU sort itself
but by Solaris's strcoll implementation,
I'm closing this coreutils "issue"
and Cc'ing bug-gnulib, in case someone there wants to
pursue the strcoll-replacement approach.
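
For anyone who hits the same failure in the meantime, the workaround named in sort's own error output (setting LC_ALL to 'C') can be sketched roughly as below. This is only an illustration under stated assumptions: the file name repro.txt is hypothetical, and its contents are rebuilt from the two octal-escaped strings in the error message, assuming those strings are the complete input lines.

```shell
# Hypothetical reproduction file, rebuilt from the octal escapes that
# sort printed in its "The strings compared were" message.
printf '\360\222\203\276\360\222\205\226\n' > repro.txt
printf '\360\222\200\255\360\222\213\253\360\222\213\253\360\222\200\255\n' >> repro.txt

# With LC_ALL=C, sort compares lines as raw bytes, so the locale's
# strcoll collation is not consulted for these strings.
LC_ALL=C sort repro.txt > sorted_c.txt
```

The result is in plain byte order, so the line starting with \360\222\200 comes before the one starting with \360\222\203. That order is not necessarily the one a locale-aware collation would give.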
--- End Message ---