Re: [bug-gawk] Segfault when using higher ascii range in regexp


From: Hermann Peifer
Subject: Re: [bug-gawk] Segfault when using higher ascii range in regexp
Date: Sun, 29 May 2016 17:10:07 +0200
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:38.0) Gecko/20100101 Thunderbird/38.7.2

On 2016-05-29 15:07, Andrew J. Schorr wrote:
> Hi,
> 
> On Sat, May 28, 2016 at 09:58:03PM +0200, Jaromir Obr wrote:
>> steps to reproduce:
>>
>> $ printf "a"|awk '/[\x80-\x81]/ {count++}; END {print count}'
>> awk: cmd. line:1: fatal error: internal error: segfault
>> Aborted (core dumped)
>>
>> or simply:
>> $ printf "a"|awk '/[\x7f-\x80]/'
>> awk: cmd. line:1: fatal error: internal error: segfault
>> Aborted (core dumped)
> 
> I believe this issue has already been fixed:
> 
> bash-4.2$ gawk --version | head -1
> GNU Awk 4.1.3f, API: 1.1 (GNU MPFR 3.1.1, GNU MP 5.1.1)
> bash-4.2$ printf "a" | ./gawk '/[\x7f-\x80]/'
> bash-4.2$ 
> 
> Unfortunately, a new version with this fix has not yet been
> released, and I cannot easily find the relevant patch.
> 
> For the time being, you could grab a development version from
> the git repository:
> 
> https://www.gnu.org/software/gawk/manual/html_node/Accessing-The-Source.html
> 

An alternative workaround would be to force the C locale: LC_ALL=C awk '/[\x7f-\x80]/'

If byte 0x80 did occur in the source data, there would in any case be a
mismatch between the data encoding and the locale, which one usually wants
to avoid.
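
As a rough sketch of that workaround (untested against the affected gawk
release): the helper name count_high_bytes is made up for illustration, and
the octal escapes \177 and \200 are my substitution for \x7f and \x80, chosen
only because octal escapes are more widely supported across awk
implementations. With LC_ALL=C the input is treated as plain bytes, so the
bracket range is a simple byte range:

```shell
# Hypothetical helper: count input lines containing a byte in 0x7f-0x80.
# LC_ALL=C makes awk treat the input as single bytes rather than as
# multibyte characters of the current locale.
count_high_bytes() {
    LC_ALL=C awk '/[\177-\200]/ { count++ } END { print count + 0 }'
}

# Two input lines: "a" and a line holding the single byte 0x80.
printf 'a\n\200\n' | count_high_bytes
```

The "count + 0" in the END rule only forces numeric output, so that 0 is
printed instead of an empty line when nothing matches.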

Hermann


