bug-grep

Re: [PATCH 2/9] dfa: fix handling of ranges in multibyte character sets


From: Jim Meyering
Subject: Re: [PATCH 2/9] dfa: fix handling of ranges in multibyte character sets
Date: Mon, 15 Mar 2010 15:00:39 +0100

Paolo Bonzini wrote:

>> Well, I would really like a test that passes with,
>> and fails without, that fix, so how about using something like this:
>>
>> This shows that grep-2.5.3 gets it wrong:
>>
>>      $ printf '%s\n' A Z | LC_ALL=en_US.UTF-8 grep -i '[a-z]'
>>      A
>>
>> and with your fix, grep -i does what we would expect:
>>
>>      $ printf '%s\n' A Z | LC_ALL=en_US.UTF-8 src/grep -i '[a-z]'
>>      A
>>      Z
>
> Great, I'll squash this in:
>
> diff --git a/tests/case-fold-char-range b/tests/case-fold-char-range
> index e683da9..9b3120f 100644
> --- a/tests/case-fold-char-range
> +++ b/tests/case-fold-char-range
> @@ -3,18 +3,19 @@
>  : ${srcdir=.}
>  . "$srcdir/init.sh"; path_prepend_ ../src
>
> -printf 'Y\n'      > exp1 || framework_failure
> +printf 'A\nZ\n'      > exp1 || framework_failure
>  fail=0
>
>  for LOC in en_US.UTF-8 zh_CN $LOCALE_FR_UTF8; do
> -  printf '1\nY\n.\n' | LC_ALL=$LOC grep -i '[a-z]' > out1 || fail=1
> +  printf 'A\n1\nZ\n.\n' | LC_ALL=$LOC grep -i '[a-z]' > out1 || fail=1
>    compare out1 exp1 || fail=1
>  done
>
> -printf 'y\n'      > exp2 || framework_failure
> +# This actually passes also for grep-2.5.3
> +printf 'a\nz\n'      > exp2 || framework_failure
>
>  for LOC in en_US.UTF-8 zh_CN $LOCALE_FR_UTF8; do
> -  printf '1\ny\n.\n' | LC_ALL=$LOC grep -i '[A-Z]' > out2 || fail=1
> +  printf 'a\n1\nz\n.\n' | LC_ALL=$LOC grep -i '[A-Z]' > out2 || fail=1
>    compare out2 exp2 || fail=1
>  done
>
> (tested to fail before and pass after my patch)

Perfect.
Please add a comment along these lines just before
your changed lines in dfa.c:

    /* Map a case-folded range, say [m-z] (or even [M-z]) to the
       pair of ranges, [m-z] [M-Z].  */

Then, this one is good to go.
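
To make the mapping in that comment concrete, here is a minimal shell
sketch. It assumes single ASCII letters as range endpoints, and
fold_range is a hypothetical helper used only for illustration; it is
not the dfa.c code, which handles multibyte characters in C.

```shell
# Hypothetical helper (not part of grep): print the pair of ranges that a
# case-folded range expands to.  Endpoints are assumed to be single ASCII
# letters, so mixed-case input such as "M z" is normalized on both ends.
fold_range() {
  lower_start=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  lower_end=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  upper_start=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  upper_end=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
  printf '[%s-%s] [%s-%s]\n' \
    "$lower_start" "$lower_end" "$upper_start" "$upper_end"
}
```

With this sketch, both `fold_range m z` and `fold_range M z` print
`[m-z] [M-Z]`, matching the two cases named in the comment.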



