bug-gawk

Re: [bug-gawk] 4.0.0 Regex Patterns Choke on Exotic Chars


From: David Millis
Subject: Re: [bug-gawk] 4.0.0 Regex Patterns Choke on Exotic Chars
Date: Mon, 12 Sep 2011 03:00:54 -0700 (PDT)

> > Date: Sun, 11 Sep 2011 21:02:25 +0300
> > From: Eli Zaretskii <address@hidden>
> > Subject: Re: [bug-gawk] 4.0.0 Regex Patterns Choke on Exotic Chars
> > To: address@hidden
> > Cc: address@hidden, address@hidden
> >
> > My binaries compiled by myself also show the problem.
> >
> > If you cannot reproduce this on GNU/Linux, even if you
> > set up the locale to use windows-1252 character set, then
> > I'd appreciate instructions to how to debug this.

I think the logic there is backward. 1252 chars should NOT be a problem in
their native locale. If Linux already handles them without complaint in some
other locale, switching TO 1252 shouldn't produce any informative errors. In
other words, Linux is doing something right that the Windows builds need to match.

If the vulnerable Windows builds are made with a more restrictive locale than
1252 (like strict Latin-1 or UTF-8), that may be the cause. That'd be something
to reproduce on Linux. The odd thing is that middle dot (\xb7), the bonus char
I mentioned, should be legal in more locales, yet it exhibits the same
failure.
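
To make the charset comparison concrete, here is a rough Python sketch (not gawk, and not a reproduction of the bug) that shows how a single byte is treated under cp1252, Latin-1, and UTF-8. The helper name classify() is made up for illustration, and \x93 is just a byte I picked from the 0x80-0x9F range as an assumption; it is not necessarily one of the chars from the original report.

```python
import unicodedata

def classify(byte_value, encoding):
    """Hypothetical helper: report how one byte looks under an encoding."""
    try:
        ch = bytes([byte_value]).decode(encoding)
    except UnicodeDecodeError:
        # The byte is not a complete, valid sequence in this encoding.
        return "invalid"
    # Unicode category "Cc" marks control characters, which includes the
    # C1 range (0x80-0x9F) that strict Latin-1 assigns to those bytes.
    return "control" if unicodedata.category(ch) == "Cc" else "printable"

if __name__ == "__main__":
    for b in (0xB7, 0x93):  # middle dot, plus one assumed 0x80-0x9F byte
        for enc in ("cp1252", "latin-1", "utf-8"):
            print(f"\\x{b:02x} in {enc}: {classify(b, enc)}")
```

If that holds up, \xb7 is an ordinary printable char in both cp1252 and Latin-1 and only becomes invalid as a lone byte under UTF-8, which is why its failure seems to point away from a plain "Latin-1 is stricter" explanation.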


David


