autoconf
Re: position of setting locale-related variables in acgeneral.m4?


From: Paul Eggert
Subject: Re: position of setting locale-related variables in acgeneral.m4?
Date: Tue, 17 Oct 2000 11:13:21 -0700 (PDT)

   From: Akim Demaille <address@hidden>
   User-Agent: Gnus/5.0807 (Gnus v5.8.7) XEmacs/21.1 (Channel Islands)

   The failure is extremely famous, and there is an incredible amount of
   news in gnu.bug.utils about this but in the context of grep.  It is
   simply that some locales have more letters in between a and b.

It is worse than that.  In some locales, [a-z] does not match 'b'.
And in some locales, [a-z] can match a string of two (or more!)
characters.  It can even match nondeterministically; that is, [a-z]
might match either "c" or "ch" at the same position.

An amusing instance of the problem is that in some locales, [^ch]+ can
match "ch".  If "ch" is a multicharacter collating element, then [^ch]
must match "ch", because the element "ch" is neither "c" nor "h".
Therefore, [^ch]+ matches "ch".

It is difficult to find these anomalies purely by testing.
And all of this behavior conforms fully to POSIX.2.

Confused?  Surprised?  You're not alone.
The most portable rules of thumb are:

* Avoid range expressions like [a-z].
* Avoid non-matching list expressions like [^aeiou].
* If you must use expressions like the above,
  use them only in the "C" locale.
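
The rules above can be sketched in shell as follows.  This is only an
illustration, assuming a POSIX shell and grep; the function names
lower_only and lower_only_explicit are hypothetical and do not come
from autoconf.  The first helper forces the "C" locale for a single
grep invocation; the second avoids the range expression by spelling
out the letters.

```sh
# Hypothetical helper: keep only lines made entirely of ASCII lowercase
# letters.  LC_ALL=C applies to this one grep invocation only, so the
# range [a-z] is interpreted in the "C" locale.
lower_only() {
  LC_ALL=C grep '^[a-z][a-z]*$'
}

# Hypothetical alternative: no range expression at all; every letter
# is listed explicitly.
lower_only_explicit() {
  grep '^[abcdefghijklmnopqrstuvwxyz][abcdefghijklmnopqrstuvwxyz]*$'
}

# Of the three input lines, only "abc" is all lowercase letters.
printf '%s\n' abc ABC a1 | lower_only
```

Setting LC_ALL on the command line, rather than exporting it, keeps
the change local to that one command and leaves the rest of the script
in the user's locale.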
