Subject: Re: [bug #33198] Incorrect bracket expression when parsing in ru_RU.KOI8-R (Russian locale)
From: Jim Meyering
Date: Sat, 07 May 2011 15:27:12 +0200
The problem is more serious than I first thought
because it also affects the C locale. For example,
this should print grep's input line, but instead prints FAIL:
printf '\xff\n'|LC_ALL=C grep "$(printf '[\xff]')" || echo FAIL
This fixes it and adds a test:
From 8da41c930e03a8635cbd8c89e3e591374c232c89 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Sat, 7 May 2011 14:25:10 +0200
Subject: [PATCH 1/2] fix a bug whereby echo c|grep '[c]' would fail for any c
in 0x80..0xff
* src/dfa.c (setbit_case_fold) [MBS_SUPPORT]: Set the bit also
when wctob returns EOF.
* NEWS (Bug fixes): Mention it.
---
NEWS | 6 ++++++
src/dfa.c | 3 ++-
2 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/NEWS b/NEWS
index 9e9974a..0cbf9ab 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,11 @@ GNU grep NEWS -*- outline -*-
** Bug fixes
+ echo c|grep '[c]' would fail for any c in 0x80..0xff, and in many locales.
+ E.g., printf '\xff\n'|grep "$(printf '[\xff]')" || echo FAIL
+ would print FAIL rather than the required matching line.
+ [bug introduced in grep-2.6]
+
grep's interpretation of range expression is now more consistent with
that of other tools. [bug present since multi-byte character set
support was introduced in 2.5.2, though the steps needed to reproduce
@@ -12,6 +17,7 @@ GNU grep NEWS -*- outline -*-
grep erroneously returned with exit status 1 on some memory allocation
failure. [bug present since "the beginning"]
+
* Noteworthy changes in release 2.7 (2010-09-16) [stable]
** Bug fixes
diff --git a/src/dfa.c b/src/dfa.c
index f2064ed..b41cbb6 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -573,7 +573,8 @@ setbit_case_fold (
else
{
#if MBS_SUPPORT
- if (wctob ((unsigned char)b) == b)
+ int b2 = wctob ((unsigned char) b);
+ if (b2 == EOF || b2 == b)
#endif
setbit (b, c);
}
--
1.7.5.1.299.g6e1e4
From d98338ebf842ec9b69631837eee50ebdcd543505 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Wed, 4 May 2011 13:07:36 +0200
Subject: [PATCH 2/2] tests: exercise bug with 0x80..0xff in [...]
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* tests/high-bit-range: New test, inspired by an example in the
report by Igor O. Ladygin: http://bugs.debian.org/624387,
via Santiago Ruano Rincón's http://savannah.gnu.org/bugs/?33198
* tests/Makefile.am (TESTS): Add it.
---
THANKS | 2 ++
tests/Makefile.am | 1 +
tests/high-bit-range | 28 ++++++++++++++++++++++++++++
3 files changed, 31 insertions(+), 0 deletions(-)
create mode 100644 tests/high-bit-range
diff --git a/THANKS b/THANKS
index 116b9c4..9ee6be3 100644
--- a/THANKS
+++ b/THANKS
@@ -37,6 +37,7 @@ H. Merijn Brand <address@hidden>
Harald Hanche-Olsen <address@hidden>
Hans-Bernhard Broeker <address@hidden>
Heikki Korpela <address@hidden>
+Igor O. Ladygin <address@hidden>
Ilya Basin <address@hidden>
Isamu Hasegawa <address@hidden>
Jaroslav Škarvada <address@hidden>
@@ -76,6 +77,7 @@ Philippe De Muyter <address@hidden>
Philip Hazel <address@hidden>
Roland Roberts <address@hidden>
Ruslan Ermilov <address@hidden>
+Santiago Ruano Rincón <address@hidden>
Santiago Vila <address@hidden>
Shannon Hill <address@hidden>
Sotiris Vassilopoulos <address@hidden>
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 7233c01..53314a8 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -63,6 +63,7 @@ TESTS = \
inconsistent-range \
khadafy \
max-count-vs-context \
+ high-bit-range \
options \
pcre \
pcre-z \
diff --git a/tests/high-bit-range b/tests/high-bit-range
new file mode 100644
index 0000000..d150633
--- /dev/null
+++ b/tests/high-bit-range
@@ -0,0 +1,28 @@
+#!/bin/sh
+# Exercise high-bit-set unibyte-in-[...]-range bug.
+
+# Copyright (C) 2011 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+fail=0
+
+printf '\x81\n' > in || framework_failure_
+grep "$(printf '[\x81]')" in > out || fail=1
+
+compare out in || fail=1
+
+Exit $fail
--
1.7.5.1.299.g6e1e4