From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#19142: closed (sort not working with LANG set to language_country.encoding)
Date: Fri, 21 Nov 2014 17:00:05 +0000

Your message dated Fri, 21 Nov 2014 09:59:20 -0700
with message-id <address@hidden>
and subject line Re: bug#19142: sort not working with LANG set to language_country.encoding
has caused the debbugs.gnu.org bug report #19142,
regarding sort not working with LANG set to language_country.encoding
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)

--
19142: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=19142
GNU Bug Tracking System
Contact address@hidden with problems

--- Begin Message ---
Subject: sort not working with LANG set to language_country.encoding
Date: Fri, 21 Nov 2014 12:24:56 +0100

Hi.

I have noticed that sort seems to have problems when the LANG environment variable is set with language and country.
As a test case, I tried to sort:
a
b
a
⺌
⺕
⺌
It sorts OK like this, with LANG just the language.encoding:
( setenv LANG en.UTF-8 ; echo 'a\nb\na\n⺌\n⺕\n⺌' | sort )
a
a
b
⺌
⺌
⺕
But not with LANG as language_country.encoding:
( setenv LANG en_GB.UTF-8 ; echo 'a\nb\na\n⺌\n⺕\n⺌' | sort )
⺌
⺕
⺌
a
a
b
sort: sort (GNU coreutils) 8.21
Shell: tcsh 6.18.01 (Astron) 2012-02-14 (x86_64-unknown-linux) options wide,nls,dl,al,kan,rh,color,filec
Fedora Linux 20

Regards, ospalh
--- End Message ---
--- Begin Message ---
Subject: Re: bug#19142: sort not working with LANG set to language_country.encoding
Date: Fri, 21 Nov 2014 09:59:20 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0

tag 19142 notabug
thanks

On 11/21/2014 04:24 AM, Roland Sieker wrote:
> Hi.
>
> I have noticed that sort seems to have problems when the LANG environment
> variable is set with language and country.
>

Thanks for the report. The whole point of locales is that each locale is
free to choose the collation sequences that make the most sense for that
locale.

> It sorts OK like this, with LANG just the language.encoding:
> ( setenv LANG en.UTF-8 ; echo 'a\nb\na\n⺌\n⺕\n⺌' | sort )

[I'm translating your csh syntax into more-reliable sh syntax]
Try turning on sort debugging:

$ printf 'a\nb\na\n⺌\n⺕\n⺌' | LC_ALL=en.UTF-8 sort --debug
sort: using simple byte comparison
a
_
a
_
b
_
⺌
___
⺌
___
⺕
___

> But not with LANG as language_country.encoding:

$ printf 'a\nb\na\n⺌\n⺕\n⺌' | LC_ALL=en_GB.UTF-8 sort --debug
sort: using ‘en_GB.UTF-8’ sorting rules
⺌
__
⺕
__
⺌
__
a
_
a
_
b
_

That just means that whoever wrote the en_GB.UTF-8 locale picked a
different collation sequence for non-ASCII characters than the person
that wrote the generic en.UTF-8 locale. That's not a bug in sort, so
I'm closing this as not a bug from coreutils' perspective. Feel free to
raise it as a glibc bug (the owner of locale definitions on GNU/Linux
systems) if you have a strong reason why different locales should be
more consistent on their choice of collation sequences. And feel free
to reply further to this bug with more questions or comments, even
though it has been closed.

--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature
--- End Message ---
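
The reply above attributes the difference to each locale's collation rules. If a locale-independent order is wanted, one common approach is to force the C locale for the sort call alone, which makes sort compare raw bytes (the same behaviour the --debug output reports as "simple byte comparison"). The following is a minimal sketch of that idea, not something taken from the thread; the function names bytesort and sample are hypothetical helpers introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: sort stdin by raw byte value, regardless of the
# caller's locale. LC_ALL takes precedence over LANG and the other LC_*
# variables, and setting it here affects only this one sort invocation.
bytesort() {
    LC_ALL=C sort "$@"
}

# Hypothetical helper: emit the six-line test input from the report.
sample() {
    printf 'a\nb\na\n⺌\n⺕\n⺌\n'
}

# ASCII letters sort first; the multi-byte UTF-8 characters follow,
# ordered by their encoded bytes.
sample | bytesort
```

Under these assumptions the output order is a, a, b, ⺌, ⺌, ⺕, matching the byte-comparison run shown in the reply, whatever LANG is set to in the calling shell.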