bug-coreutils

Sort and LC_COLLATE and utf


From: felix
Subject: Sort and LC_COLLATE and utf
Date: Thu, 4 May 2006 08:22:17 -0700
User-agent: Mutt/1.5.11

I have discovered an inconsistency in how perl and sort handle locale.
Here are two commands you can run in a shell:

    (echo '/'; echo '?') | sort

    (echo '/'; echo '?') | perl -e '@x = <>; print $_ foreach sort @x'

With LANG=en_US.UTF-8, sort says ? comes before /, while perl says the
opposite.  Setting LC_COLLATE=C reverses sort's output.  I have no
idea what else changes, where else perl and sort disagree, or what
other programs do.  For all I know, UTF mandates that punctuation sorts
before other things.  I don't know what the proper behavior is, other
than that being different isn't :-)
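
A minimal sketch for reproducing the comparison side by side, assuming a
POSIX shell.  The helper name sort_with is hypothetical, and the
en_US.UTF-8 result depends on the locale data installed on the system, so
no particular order is claimed for it.  As far as I understand, perl's
built-in sort compares strings by character code unless "use locale" is
in effect, which would explain why it matches the C ordering; the last
step tries that pragma to see whether perl then follows LC_COLLATE.

```shell
#!/bin/sh
# Sort the two characters from the report under a chosen locale.
# sort_with is a hypothetical helper name, not part of coreutils.
# LC_ALL is used because it overrides both LANG and LC_COLLATE.
sort_with() {
    printf '%s\n' '/' '?' | LC_ALL="$1" sort
}

# Byte-order collation: '/' (0x2F) sorts before '?' (0x3F).
c_order=$(sort_with C | tr -d '\n')
echo "sort, C locale:     $c_order"

# Locale-aware collation; the order depends on the installed locale data.
utf8_order=$(sort_with en_US.UTF-8 | tr -d '\n')
echo "sort, en_US.UTF-8:  $utf8_order"

# Same input through perl with locale-aware comparison enabled.
# Skipped when perl is not installed.
if command -v perl >/dev/null 2>&1; then
    perl_order=$(printf '%s\n' '/' '?' |
        LC_ALL=en_US.UTF-8 perl -e 'use locale; print sort <>' | tr -d '\n')
    echo "perl, use locale:   $perl_order"
fi
```

If the first two lines of output differ, the disagreement comes from the
collation rules of the locale rather than from the input itself.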

I have tried this on two systems.  One system is Gentoo, sort version
5.94, perl version 5.8.8.  The other system is rPath, sort version
5.2.1, perl version 5.8.7.  I can easily supply further info or run
other tests if you need them.  I am at your service!

-- 
            ... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
     Felix Finch: scarecrow repairman & rocket surgeon / address@hidden
  GPG = E987 4493 C860 246C 3B1E  6477 7838 76E9 182E 8151 ITAR license #4933
I've found a solution to Fermat's Last Theorem but I see I've run out of room o



