bug-gnu-utils

Re: Case insensitivity seems to ignore lower bound of interval


From: Paul Jarc
Subject: Re: Case insensitivity seems to ignore lower bound of interval
Date: Thu, 28 Apr 2011 00:17:07 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux)

Eric Bischoff <address@hidden> wrote:
> 1) Contradiction with the documentation :
>
> http://www.gnu.org/software/gawk/manual/gawk.html#Locales says that
>
>      $ echo something1234abc | gawk '{ sub("[A-Z]*$", ""); print }'
>
> returns
>
>       something1234

That example behaves as described in the documentation for some
locales, but not in others (such as yours, apparently).  That's the
whole point of that section of the documentation--different locales
have different behavior for character ranges.

Note that case-insensitivity is not an intended feature at all.  It's
just an accidental result of the character collation of some locales.
Some locales arrange characters in the order aAbBcC...zZ, so a range
like [A-Z] includes all upper- and lowercase letters except lowercase
a.  Other locales may arrange them as AaBbCc...Zz, so [A-Z] excludes
lowercase z instead.  But the usual expectation, and the actual
behavior in the C locale, is that [A-Z] includes only uppercase
letters, and [a-z] includes only lowercase letters.
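
If you want the C-locale behavior regardless of the user's environment, one option is to set LC_ALL=C for just the awk invocation. The sketch below is only an illustration: `strip_trailing` is a hypothetical helper name, and falling back to plain `awk` when `gawk` is not installed is an assumption made for portability. The second call uses the POSIX character class `[[:upper:]]`, which names uppercase letters directly instead of relying on how a range collates.

```shell
# Prefer gawk; fall back to whatever awk is on PATH (assumption for portability).
AWK=$(command -v gawk || command -v awk)

# Hypothetical helper: remove a trailing run of characters that match the
# bracket expression in $1 from the string in $2, with the C locale forced
# for the awk process only.
strip_trailing() {
    printf '%s\n' "$2" | LC_ALL=C "$AWK" -v re="$1*\$" '{ sub(re, ""); print }'
}

# In the C locale [A-Z] covers only uppercase letters, so the lowercase
# "abc" is left in place and the line is printed unchanged.
strip_trailing '[A-Z]' 'something1234abc'

# A character class asks for uppercase letters by name rather than by range.
strip_trailing '[[:upper:]]' 'something1234ABC'
```

Setting LC_ALL in front of the command affects only that one process, so the rest of the script keeps the user's locale.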


paul


