
[bug-grep] Testing for UTF-8 bugs [was: Applying outstanding patches]


From: Julian Foad
Subject: [bug-grep] Testing for UTF-8 bugs [was: Applying outstanding patches]
Date: Thu, 28 Apr 2005 20:06:59 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8b) Gecko/20050217

Charles Levert wrote:
> * On Thursday 2005-04-28 at 17:27:06 +0100, Julian Foad wrote:
> > when the script is run from within "make check" the LANG environment
> > variable does not get through from the calling line in my script to
> > the function defined in the same script.
>
> Isn't using LANG instead of the higher priority
> LC_ALL risky in any case?

Good point - in fact, that was the problem (the test suite sets LC_ALL=C).  
Thanks.
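
A minimal sketch of why a LANG prefix has no effect once LC_ALL is exported, assuming the usual POSIX precedence LC_ALL > LC_CTYPE > LANG.  This is not part of the patch; `effective_ctype` is a hypothetical helper written only for illustration.

```shell
# Hypothetical helper, not from tests/foad1.sh: report which locale name
# applies to LC_CTYPE, following the precedence LC_ALL > LC_CTYPE > LANG,
# with "C" assumed when none of them is set.
effective_ctype() {
  if [ -n "${LC_ALL:-}" ]; then
    echo "$LC_ALL"
  elif [ -n "${LC_CTYPE:-}" ]; then
    echo "$LC_CTYPE"
  elif [ -n "${LANG:-}" ]; then
    echo "$LANG"
  else
    echo "C"
  fi
}

# Mimic the test suite environment: LC_ALL=C is exported.
LC_ALL=C; export LC_ALL
unset LC_CTYPE

LANG=cs_CZ.UTF-8 effective_ctype      # prints "C": LC_ALL still wins
LC_ALL=cs_CZ.UTF-8 effective_ctype    # prints "cs_CZ.UTF-8"
```

Overriding LC_ALL on the calling line, as the second call does, is the approach the attached test uses.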

Does this attached patch look right?
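
For checking the behaviour by hand outside the test suite, something along these lines might do.  It is only a sketch: `match_in_locale` is a hypothetical helper, GREP defaults to plain "grep" here, and the cs_CZ.UTF-8 locale is assumed to be installed (if it is not, grep will most likely fall back to the C locale and the last line proves nothing).

```shell
# Sketch only: check whether "[" is wrongly matched by [[:alpha:]].
# GREP defaults to "grep"; the test suite sets its own value.
GREP=${GREP:-grep}

# Hypothetical helper: prints "match" or "no match" for a one-line input.
#   $1 = locale name, $2 = input line, $3 = extended regular expression
# A grep error (exit status 2) is reported as "no match" as well.
match_in_locale() {
  if printf '%s\n' "$2" | LC_ALL=$1 "$GREP" -E -q -e "$3"; then
    echo "match"
  else
    echo "no match"
  fi
}

match_in_locale C '[' '[[:alpha:]]'             # prints "no match"
match_in_locale C 'a' '[[:alpha:]]'             # prints "match"
match_in_locale cs_CZ.UTF-8 '[' '[[:alpha:]]'   # should print "no match" with the fix
```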

- Julian
Index: ChangeLog
===================================================================
RCS file: /cvsroot/grep/grep/ChangeLog,v
retrieving revision 1.237
diff -u -3 -p -d -r1.237 ChangeLog
--- ChangeLog   28 Apr 2005 15:17:13 -0000      1.237
+++ ChangeLog   28 Apr 2005 19:05:19 -0000
@@ -2,6 +2,10 @@
 
        * tests/foad1.sh: Remove Bash-specific syntax.
 
+       * src/dfa.c: Fix a bug whereby a bracket "[" was matched by the
+         pattern "[[:alpha:]]" in UTF-8 locales.  Patch #3800, by Tim Waugh.
+       * tests/foad1.sh: Add a regression test for that.
+
 2005-04-27  Julian Foad  <address@hidden>
 
        Fix a bug reported by Elliott Hughes in patch #1834 whereby "grep -Fw"
Index: src/dfa.c
===================================================================
RCS file: /cvsroot/grep/grep/src/dfa.c,v
retrieving revision 1.34
diff -u -3 -p -d -r1.34 dfa.c
--- src/dfa.c   11 Apr 2005 22:36:32 -0000      1.34
+++ src/dfa.c   28 Apr 2005 19:05:20 -0000
@@ -632,7 +632,7 @@ parse_bracket_exp_mb ()
                      work_mbc->coll_elems[work_mbc->ncoll_elems++] = elem;
                    }
                }
-             wc = WEOF;
+             wc1 = wc = WEOF;
            }
          else
            /* We treat '[' as a normal character here.  */
Index: tests/foad1.sh
===================================================================
RCS file: /cvsroot/grep/grep/tests/foad1.sh,v
retrieving revision 1.5
diff -u -3 -p -d -r1.5 foad1.sh
--- tests/foad1.sh      28 Apr 2005 15:17:14 -0000      1.5
+++ tests/foad1.sh      28 Apr 2005 19:05:21 -0000
@@ -80,4 +80,13 @@ grep_test "A/CX/B/C/" "A/B/C/" -wF -e A 
 grep_test "LIN7C 55327/" "" -wF -e 5327 -e 5532
 
 
+# Test character class erroneously matching a '[' character.
+# If the UTF-8 locale doesn't work, skip this test silently.
+if LC_ALL=cs_CZ.UTF-8 locale -k LC_CTYPE 2>/dev/null |
+  "${GREP}" -q "charmap.*UTF-8"
+then
+  LC_ALL=cs_CZ.UTF-8 grep_test "[/" "" "[[:alpha:]]" -E
+fi
+
+
 exit $failures