Sort and LC_COLLATE and utf
From: felix
Subject: Sort and LC_COLLATE and utf
Date: Thu, 4 May 2006 08:22:17 -0700
User-agent: Mutt/1.5.11
I have discovered an inconsistency in how perl and sort handle locale.
Here are two commands you can run in a shell ...
(echo '/'; echo '?') | sort
(echo '/'; echo '?') | perl -e '@x = <>; print $_ foreach sort @x'
With LANG=en_US.UTF-8, sort says ? comes before /, while perl says the
opposite. Setting LC_COLLATE=C reverses sort's order. I have no
idea what else changes, where else perl and sort disagree, or what
other programs do. For all I know, UTF mandates that punctuation sorts
before other things. I don't know what the proper behavior is, other
than that being different isn't :-)
I have tried this on two systems. One system is Gentoo, sort version
5.94, perl version 5.8.8. The other system is rPath, sort version
5.2.1, perl version 5.8.7. I can easily supply further info or run
other tests if you need them. I am at your service!
--
... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
Felix Finch: scarecrow repairman & rocket surgeon / address@hidden
GPG = E987 4493 C860 246C 3B1E 6477 7838 76E9 182E 8151 ITAR license #4933
I've found a solution to Fermat's Last Theorem but I see I've run out of room o