bug-gnu-utils

Re: gawk ignores case with LANG=en_US


From: Aharon Robbins
Subject: Re: gawk ignores case with LANG=en_US
Date: Wed, 13 May 2009 19:05:08 +0300

Hi.

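With locales, ranges don't mean what you think they mean. In a locale other
than "C", a range such as [a-z] follows the locale's collating order rather
than ASCII order, so it can match uppercase letters as well. This is
discussed in the gawk documentation. Use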

        /^[[:lower:]]/ { print }

to get what you want.

Thanks,

Arnold

> Subject: gawk ignores case with LANG=en_US
> From: Jim Keniston <address@hidden>
> To: address@hidden
>
> --- bug.awk ---
> /^[a-z]/ { print }
> --- input ---
> 1
> a
> A
> --- output_buggy ---
> a
> A
> --- output_expected ---
> a
> -----
> Repeat by:
> $ gawk -f bug.awk < input
>
> Assuming that environment variables LC_ALL and LC_CTYPE are
> undefined, if I run the above with the LANG environment variable
> set to "en_US.utf8" or "en_US", "A" matches "^[a-z]" and the
> output is as in output_buggy.  Setting IGNORECASE=0 in the
> command line or the script doesn't help.
>
> If I do
> $ LANG= gawk -f bug.awk < input
> I get the expected output.
>
> gawk version: GNU Awk 3.1.5
> OS: RH Fedora 9 + Linux v2.6.29-rc8
>
> Jim Keniston
> IBM Linux Technology Center
> Beaverton, OR
>



