Re: [bug-gawk] Segfault when using higher ascii range in regexp
From: Hermann Peifer
Subject: Re: [bug-gawk] Segfault when using higher ascii range in regexp
Date: Sun, 29 May 2016 17:10:07 +0200
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:38.0) Gecko/20100101 Thunderbird/38.7.2

On 2016-05-29 15:07, Andrew J. Schorr wrote:
> Hi,
>
> On Sat, May 28, 2016 at 09:58:03PM +0200, Jaromir Obr wrote:
>> steps to reproduce:
>>
>> $ printf "a"|awk '/[\x80-\x81]/ {count++}; END {print count}'
>> awk: cmd. line:1: fatal error: internal error: segfault
>> Aborted (core dumped)
>>
>> or simply:
>> $ printf "a"|awk '/[\x7f-\x80]/'
>> awk: cmd. line:1: fatal error: internal error: segfault
>> Aborted (core dumped)
>
> I believe this issue has already been fixed:
>
> bash-4.2$ gawk --version | head -1
> GNU Awk 4.1.3f, API: 1.1 (GNU MPFR 3.1.1, GNU MP 5.1.1)
> bash-4.2$ printf "a" | ./gawk '/[\x7f-\x80]/'
> bash-4.2$
>
> Unfortunately, a new version with this fix has not yet been
> released, and I cannot easily find the relevant patch.
>
> For the time being, you could grab a development version from
> the git repository:
>
> https://www.gnu.org/software/gawk/manual/html_node/Accessing-The-Source.html
>
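An alternative is to run awk in the C locale: LC_ALL=C awk '/[\x7f-\x80]/'
If byte 0x80 actually occurs in the source data, the data encoding and the
locale do not match anyway (in a UTF-8 locale, for instance, a lone 0x80 is
not a valid character), and that is a mismatch one usually wants to avoid.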
Hermann
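
A minimal sketch of the LC_ALL=C workaround suggested above, assuming awk is gawk (the \x escapes in the regexp may not be understood by other awks). The helper name count_high_bytes is hypothetical and only wraps the one-liner from this thread so that it can be fed different inputs. In the C locale every byte is treated as a single character, so the range is compared byte by byte.

```shell
# Hypothetical helper: count input lines containing a byte in 0x7f-0x80.
# LC_ALL=C forces the C locale for this one awk invocation only.
count_high_bytes() {
    LC_ALL=C awk '/[\x7f-\x80]/ { count++ } END { print count + 0 }'
}

# Plain ASCII input: no line should match.
printf 'a\n' | count_high_bytes

# Input containing byte 0x80 (written as octal \200 for printf).
printf 'a\200b\n' | count_high_bytes
```

With gawk, the first call should print 0 and the second 1. Setting LC_ALL only on the awk command leaves the locale of the rest of the shell session untouched.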