From: Chris F.A. Johnson
Subject: Re: locale specific ordering in EN_US -- why is a<A<b<B<y<Y<z<Z?
Date: Mon, 21 May 2012 15:21:26 -0400 (EDT)
User-agent: Alpine 2.00 (LMD 1167 2008-08-23)
On Mon, 21 May 2012, Linda Walsh wrote:
Greg Wooledge wrote:
On Sun, May 20, 2012 at 11:36:35AM -0700, Linda Walsh wrote:
For instance, on HP-UX 10.20, in the en_US.iso88591 locale: A a ... B b
Meanwhile, on Debian 6.0, in the en_US.iso88591 locale: a A ... b B
As you can see, the two en_US.iso88591 implementations are not the same.
----
Great!... So which is correct? Anyone wanting to reference an upper or lower case range [a-z] or [A-Z], is gonna hurt from this.
Use the correct references: [:upper:] and [:lower:] or (as I do) always use LC_ALL=C in your scripts.
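A minimal sketch of both suggestions, assuming bash. The function names `is_upper` and `is_lower_c` are made up for illustration and do not come from this thread. Note that in a shell pattern the class name goes inside a bracket expression, so it is written `[[:upper:]]`.

```shell
#!/bin/bash

# Option 1: a POSIX character class. The match depends on the locale's
# idea of "uppercase letter", not on its collation order.
# is_upper is a hypothetical helper name.
is_upper() {
    case $1 in
        [[:upper:]]) return 0 ;;
        *)           return 1 ;;
    esac
}

# Option 2: force the C locale for the rest of the script, so that a
# range such as [a-z] follows plain byte order.
LC_ALL=C
export LC_ALL

# is_lower_c is a hypothetical helper name; it relies on LC_ALL=C above.
is_lower_c() {
    case $1 in
        [a-z]) return 0 ;;
        *)     return 1 ;;
    esac
}
```

With these definitions, `is_upper A` and `is_lower_c b` succeed, while `is_upper a` and `is_lower_c B` fail. Setting LC_ALL=C affects everything the script runs afterwards, including sort order and message language, so it is the blunter of the two options.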
My OS uses "en_US.UTF-8".
My OS uses whatever I tell it to (which is C).
You'd think unicode would have something to say about collation order that wouldn't allow such randomness, but maybe not.
--
Chris F.A. Johnson, <http://cfajohnson.com/>
Author:
Pro Bash Programming: Scripting the GNU/Linux Shell (2009, Apress)
Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)