bug-grep

Re: [PATCH 2/9] dfa: fix handling of ranges in multibyte character sets


From: Paolo Bonzini
Subject: Re: [PATCH 2/9] dfa: fix handling of ranges in multibyte character sets
Date: Mon, 15 Mar 2010 12:24:59 +0100


> Well, I would really like a test that passes with,
> and fails without, that fix, so how about using something like this:
>
> This shows that grep-2.5.3 gets it wrong:
>
>      $ printf '%s\n' A Z | LC_ALL=en_US.UTF-8 grep -i '[a-z]'
>      A
>
> and with your fix, grep -i does what we would expect:
>
>      $ printf '%s\n' A Z | LC_ALL=en_US.UTF-8 src/grep -i '[a-z]'
>      A
>      Z

Great, I'll squash this in:

diff --git a/tests/case-fold-char-range b/tests/case-fold-char-range
index e683da9..9b3120f 100644
--- a/tests/case-fold-char-range
+++ b/tests/case-fold-char-range
@@ -3,18 +3,19 @@
 : ${srcdir=.}
 . "$srcdir/init.sh"; path_prepend_ ../src

-printf 'Y\n'      > exp1 || framework_failure
+printf 'A\nZ\n'      > exp1 || framework_failure
 fail=0

 for LOC in en_US.UTF-8 zh_CN $LOCALE_FR_UTF8; do
-  printf '1\nY\n.\n' | LC_ALL=$LOC grep -i '[a-z]' > out1 || fail=1
+  printf 'A\n1\nZ\n.\n' | LC_ALL=$LOC grep -i '[a-z]' > out1 || fail=1
   compare out1 exp1 || fail=1
 done

-printf 'y\n'      > exp2 || framework_failure
+# This actually passes also for grep-2.5.3
+printf 'a\nz\n'      > exp2 || framework_failure

 for LOC in en_US.UTF-8 zh_CN $LOCALE_FR_UTF8; do
-  printf '1\ny\n.\n' | LC_ALL=$LOC grep -i '[A-Z]' > out2 || fail=1
+  printf 'a\n1\nz\n.\n' | LC_ALL=$LOC grep -i '[A-Z]' > out2 || fail=1
   compare out2 exp2 || fail=1
 done


(tested to fail before and pass after my patch)
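
For anyone who wants to try the same check outside the test suite, here is a
rough standalone sketch of it. The function name check_case_fold_range is
made up for illustration and is not part of tests/ or init.sh; it only mirrors
the first loop of the test above for a single locale. It is run here in the C
locale because that one is always available, which means this call does not
exercise the multibyte code path that the patch touches.

```shell
#!/bin/sh
# Hypothetical helper, not part of the grep test suite: run the
# case-folding range check from the test above in a single locale.
check_case_fold_range() {
  loc=$1
  # With -i, the uppercase lines must match the lowercase range '[a-z]';
  # the digit and the dot must not.
  out=$(printf 'A\n1\nZ\n.\n' | LC_ALL=$loc grep -i '[a-z]')
  exp=$(printf 'A\nZ')
  if test "$out" = "$exp"; then
    echo "PASS $loc"
  else
    echo "FAIL $loc"
    return 1
  fi
}

# The C locale always exists; it is only a sanity check of the helper.
check_case_fold_range C
```

To cover the case this thread is about, call it with a multibyte locale
instead, e.g. check_case_fold_range en_US.UTF-8, assuming that locale is
installed on the machine and that the grep under test comes first in PATH.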

Paolo



