bug-grep

Re: grep -F causes infinite loop


From: Jim Meyering
Subject: Re: grep -F causes infinite loop
Date: Sat, 27 Apr 2013 18:41:48 +0200

GOTO, Daisuke wrote:
> Hello, there,
>
> (I sent this to the wrong address at first, so I am re-sending it.)
>
> grep -F goes into an infinite loop on text whose encoding differs from
> the locale's (the locale is ja_JP.UTF-8 and the text is SJIS).
>
> It did not occur with older versions (GNU grep 2.6.1 or before).
> It also does not occur when no locale is set.
...
> # printf '\202\240\202\240' | grep -F $'\202\240'

Thank you very much for that bug report.
This loops forever for me on Fedora 18, and probably in any UTF-8 locale:

    $ printf '\202\240\202\240' | LC_ALL=en_US.UTF-8 grep $'\202\240'

Here's one way to fix it, making it so grep reports no match.
While it's nearly the smallest change to avoid the infloop,
I'm debating whether we need something else.

diff --git a/src/kwsearch.c b/src/kwsearch.c
index 96da58e..61acbe7 100644
--- a/src/kwsearch.c
+++ b/src/kwsearch.c
@@ -111,11 +111,9 @@ Fexecute (char const *buf, size_t size, size_t *match_size,
           mbstate_t s;
           memset (&s, 0, sizeof s);
           size_t mb_len = mbrlen (mb_start, (buf + size) - (beg + offset), &s);
-          if (mb_len == (size_t) -2)
+          if (mb_len == (size_t) -2 || mb_len == (size_t) -1)
             goto failure;
-          beg = mb_start;
-          if (mb_len != (size_t) -1)
-            beg += mb_len - 1;
+          beg = mb_start + mb_len - 1;
           continue;
         }
       beg += offset;

I.e., the above makes grep exit with status 1, meaning no match,
which seems less than ideal.  An alternative is to report that grep has
encountered an invalid multibyte sequence and to exit with status 2.
That's what most other versions of grep do.  Here's what Solaris 10's
grep does; the empty input demonstrates that it's the invalid sequence
in the search string (not in the input) that triggers the diagnostic:

    $ : | LC_ALL=en_US.UTF-8 /bin/grep $'\202\240'
    grep: RE error 67: Illegal byte sequence.
    [Exit 2]
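
To make the "exit with status 2" alternative concrete, here is a minimal
sketch of validating a search string up front with mbrlen.  It is not
grep's code and not a proposed patch: the helper names
(is_valid_mb_string, pattern_exit_status, select_utf8_locale), the
diagnostic text, and the list of locale names tried are all made up for
illustration.  It relies only on the standard behavior that mbrlen
returns (size_t) -1 for an invalid sequence and (size_t) -2 for an
incomplete one.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: try a few common UTF-8 locale names.
   Return 1 if one could be selected, 0 otherwise.  The names are
   assumptions; which ones exist depends on the system.  */
int
select_utf8_locale (void)
{
  char const *names[] = { "en_US.UTF-8", "C.UTF-8", "en_US.utf8" };
  for (size_t i = 0; i < sizeof names / sizeof *names; i++)
    if (setlocale (LC_ALL, names[i]))
      return 1;
  return 0;
}

/* Hypothetical helper: return 1 if the LEN bytes at S consist solely of
   complete, valid multibyte characters in the current locale, else 0.  */
int
is_valid_mb_string (char const *s, size_t len)
{
  assert (s != NULL);
  mbstate_t state;
  memset (&state, 0, sizeof state);
  while (len > 0)
    {
      size_t n = mbrlen (s, len, &state);
      /* (size_t) -1: invalid sequence; (size_t) -2: incomplete one.  */
      if (n == (size_t) -1 || n == (size_t) -2)
        return 0;
      if (n == 0)
        n = 1;                  /* a null byte occupies one byte */
      s += n;
      len -= n;
    }
  return 1;
}

/* Hypothetical helper: map a pattern to an exit status, using 2 for an
   invalid pattern, as in the Solaris transcript above.  */
int
pattern_exit_status (char const *pattern)
{
  if (!is_valid_mb_string (pattern, strlen (pattern)))
    {
      fputs ("invalid multibyte sequence in pattern\n", stderr);
      return 2;
    }
  return 0;
}
```

Once a UTF-8 locale is selected, calling pattern_exit_status with the
bytes "\202\240" takes the status-2 path, because \202 (0x82) cannot
start a UTF-8 character.  A check like this runs once on the search
string, so it would not by itself change how Fexecute treats invalid
bytes in the input.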
