bug-grep

grep dfa bug


From: KIMURA Koichi
Subject: grep dfa bug
Date: Mon, 01 Aug 2005 09:12:03 +0900

Hi,

I think I found a bug in the dfa code of gawk.

Situation:
In a Japanese Shift_JIS locale, half-width katakana in a character class
do not match correctly.

Reproduce:
LANG=ja_JP.SJIS
export LANG
echo ABCDE | grep '/[A-E]\+/p'

Here A, B, C, D and E stand for half-width katakana characters.
(The data to reproduce this is appended at the end of this mail as uuencoded SJIS data.)

Result:
Nothing is printed.

I guess the patch below solves this problem, but I'm not confident
that it has no side effects in other environments.

regards,

--- dfa.c.2~    2005-03-22 14:43:10.000000000 +0900
+++ dfa.c       2005-07-31 22:21:27.000000000 +0900
@@ -2825,7 +2825,8 @@ dfaexec (struct dfa *d, char const *begi
              remain_bytes
                = mbrtowc(inputwcs + i, begin + i,
                          end - (unsigned char const *)begin - i + 1, &mbs);
-             if (remain_bytes <= 1)
+             if (remain_bytes < 1
+                  || (remain_bytes == 1 && inputwcs[i] == (wchar_t)begin[i]))
                {
                  remain_bytes = 0;
                  inputwcs[i] = (wchar_t)begin[i];



begin 644 testkana.sh
M<V5T($Q!3D<]:F%?2E`N4TI)4PIE>'!O<address@hidden;F]T('!R:6YT"F5C!
<:&address@hidden;address@hidden"!G<F5P("<O6[$MM5U<*R\G"@``(
``
end
size 73

-- 
KIMURA Koichi




