bug-grep

Re: [bug #33198] Incorrect bracket expression when parsing in ru_RU.KOI8-R (Russian locale)


From: Jim Meyering
Subject: Re: [bug #33198] Incorrect bracket expression when parsing in ru_RU.KOI8-R (Russian locale)
Date: Sat, 07 May 2011 15:27:12 +0200

The problem is more serious than I first thought
because it also affects the C locale.  For example,
this should print grep's input line, but instead prints FAIL:

    printf '\xff\n'|LC_ALL=C grep "$(printf '[\xff]')" || echo FAIL

This fixes it and adds a test:

From 8da41c930e03a8635cbd8c89e3e591374c232c89 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Sat, 7 May 2011 14:25:10 +0200
Subject: [PATCH 1/2] fix a bug whereby echo c|grep '[c]' would fail for any c
 in 0x80..0xff

* src/dfa.c (setbit_case_fold) [MBS_SUPPORT]: Set the bit also
when wctob returns EOF.
* NEWS (Bug fixes): Mention it.
---
 NEWS      |    6 ++++++
 src/dfa.c |    3 ++-
 2 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/NEWS b/NEWS
index 9e9974a..0cbf9ab 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,11 @@ GNU grep NEWS                                    -*- outline -*-

 ** Bug fixes

+  echo c|grep '[c]' would fail for any c in 0x80..0xff, and in many locales.
+  E.g., printf '\xff\n'|grep "$(printf '[\xff]')" || echo FAIL
+  would print FAIL rather than the required matching line.
+  [bug introduced in grep-2.6]
+
   grep's interpretation of range expression is now more consistent with
   that of other tools.  [bug present since multi-byte character set
   support was introduced in 2.5.2, though the steps needed to reproduce
@@ -12,6 +17,7 @@ GNU grep NEWS                                    -*- outline -*-
   grep erroneously returned with exit status 1 on some memory allocation
   failure. [bug present since "the beginning"]

+
 * Noteworthy changes in release 2.7 (2010-09-16) [stable]

 ** Bug fixes
diff --git a/src/dfa.c b/src/dfa.c
index f2064ed..b41cbb6 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -573,7 +573,8 @@ setbit_case_fold (
   else
     {
 #if MBS_SUPPORT
-      if (wctob ((unsigned char)b) == b)
+      int b2 = wctob ((unsigned char) b);
+      if (b2 == EOF || b2 == b)
 #endif
         setbit (b, c);
     }
--
1.7.5.1.299.g6e1e4


From d98338ebf842ec9b69631837eee50ebdcd543505 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Wed, 4 May 2011 13:07:36 +0200
Subject: [PATCH 2/2] tests: exercise bug with 0x80..0xff in [...]
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* tests/high-bit-range: New test, inspired by an example in the
report by Igor O. Ladygin: http://bugs.debian.org/624387,
via Santiago Ruano Rincón's http://savannah.gnu.org/bugs/?33198
* tests/Makefile.am (TESTS): Add it.
---
 THANKS               |    2 ++
 tests/Makefile.am    |    1 +
 tests/high-bit-range |   28 ++++++++++++++++++++++++++++
 3 files changed, 31 insertions(+), 0 deletions(-)
 create mode 100644 tests/high-bit-range

diff --git a/THANKS b/THANKS
index 116b9c4..9ee6be3 100644
--- a/THANKS
+++ b/THANKS
@@ -37,6 +37,7 @@ H. Merijn Brand            <address@hidden>
 Harald Hanche-Olsen        <address@hidden>
 Hans-Bernhard Broeker      <address@hidden>
 Heikki Korpela             <address@hidden>
+Igor O. Ladygin            <address@hidden>
 Ilya Basin                 <address@hidden>
 Isamu Hasegawa             <address@hidden>
 Jaroslav Škarvada          <address@hidden>
@@ -76,6 +77,7 @@ Philippe De Muyter         <address@hidden>
 Philip Hazel               <address@hidden>
 Roland Roberts             <address@hidden>
 Ruslan Ermilov             <address@hidden>
+Santiago Ruano Rincón      <address@hidden>
 Santiago Vila              <address@hidden>
 Shannon Hill               <address@hidden>
 Sotiris Vassilopoulos      <address@hidden>
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 7233c01..53314a8 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -63,6 +63,7 @@ TESTS =                                               \
   inconsistent-range                            \
   khadafy                                      \
   max-count-vs-context                         \
+  high-bit-range                               \
   options                                      \
   pcre                                         \
   pcre-z                                       \
diff --git a/tests/high-bit-range b/tests/high-bit-range
new file mode 100644
index 0000000..d150633
--- /dev/null
+++ b/tests/high-bit-range
@@ -0,0 +1,28 @@
+#!/bin/sh
+# Exercise high-bit-set unibyte-in-[...]-range bug.
+
+# Copyright (C) 2011 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+fail=0
+
+printf '\x81\n' > in || framework_failure_
+grep "$(printf '[\x81]')" in > out || fail=1
+
+compare out in || fail=1
+
+Exit $fail
--
1.7.5.1.299.g6e1e4


