[bug #36567] grep -i (case-insensitive) is broken with UTF8
From: Strahinja Kustudic
Subject: [bug #36567] grep -i (case-insensitive) is broken with UTF8
Date: Thu, 31 May 2012 11:18:30 +0000
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:11.0) Gecko/20100101 Firefox/11.0
URL:
<http://savannah.gnu.org/bugs/?36567>
Summary: grep -i (case-insensitive) is broken with UTF8
Project: grep
Submitted by: kustodian
Submitted on: Thu 31 May 2012 11:18:30 AM GMT
Category: None
Severity: 3 - Normal
Item Group: None
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
_______________________________________________________
Details:
Since version 2.6.1, grep does not work correctly when a case-insensitive
search is run on UTF-8 input that contains a multibyte character such as 'İ'.
Here is an example:
# Without -i switch everything works correctly
$ echo -e 'AA UTF8 char İ 12345\nAA 12345' | grep 'AA'
AA UTF8 char İ 12345
AA 12345
# With -i it breaks
$ echo -e 'AA UTF8 char İ 12345\nAA 12345' | grep -i 'AA'
AA UTF8 char İ 12345AA 12345
As you can see, it somehow drops the newline character at the end of the line
that contains the UTF-8 'İ' character.
Everything works correctly in versions 2.5.4 and below; it is broken from 2.6.1
to the latest version (currently 2.6.12).
This is a big concern, since it can break scripts that filter UTF-8 input
using the -i switch.
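The reproduction above can be turned into a small check for scripts that
need to know whether the installed grep is affected. This is only a sketch:
the function name count_matches is made up for illustration, and it assumes
the shell runs in a UTF-8 locale. It feeds the same two lines to grep with and
without -i and compares the number of output lines. On an affected version
the two lines are printed as one, so the counts differ.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative, not part of grep):
# pipe the two sample lines from the report through grep and print
# how many lines come out. "$@" carries the options under test, e.g. -i.
count_matches() {
    printf 'AA UTF8 char İ 12345\nAA 12345\n' | grep "$@" 'AA' | wc -l | tr -d ' '
}

plain=$(count_matches)       # case-sensitive search
folded=$(count_matches -i)   # case-insensitive search

# Both searches match both input lines, so the counts should be equal.
if [ "$plain" = "$folded" ]; then
    echo "ok: grep -i kept the line breaks ($folded lines)"
else
    echo "broken: grep printed $plain lines, grep -i printed $folded"
fi
```

Comparing the two counts, rather than hard-coding an expected number, keeps
the check focused on the difference that -i makes.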
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?36567>
_______________________________________________
Message sent via/by Savannah
http://savannah.gnu.org/