bug-grep

[bug-grep] Bug in grep 2.5.1a (+ patch to fix it)


From: Gordon Lack
Subject: [bug-grep] Bug in grep 2.5.1a (+ patch to fix it)
Date: Wed, 27 Apr 2005 12:54:16 +0100

   There is a bug in grep 2.5.1a (in fact all 2.5*) when you use the -F
and -w options together (fixed strings, words only).

   It manifests itself by not actually matching all occurrences.

   A simple example:

my-system*[1] grep --version
grep (GNU grep) 2.5.1

Copyright 1988, 1992-1999, 2000, 2001 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

my-system*[1] cat test 
An unmatched line to make the file larger
Match 2908
An unmatched line to make the file larger
Non-match 31560
An unmatched line to make the file larger
Match 3440
An unmatched line to make the file larger
Match 3156
An unmatched line to make the file larger

my-system*[1] grep -wF -e 2908 -e 3440 -e 3156 test
Match 2908


   It gives only one match, but it should have given 3.

   The problem (which can be shown with larger files) is that if the
code finds a match for a string, but then finds that this match does
not *end* at a word boundary, the whole of the rest of the buffer
segment (32kB) is skipped.  (It does not skip if the match does not
*start* at a word boundary, so changing 31560 to 03156 in the test file
results in the expected 3 matches.)


   By looking at the 2.4.2 code, and also at the difference between how
a match starting and a match ending at a non-word boundary are handled,
I have worked out a simple patch.  The problem is that when the match
ended on a non-boundary the code exited the Fexecute() function, whereas
it should just break from the loop.
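
   To illustrate the control-flow difference, here is a minimal,
self-contained sketch.  It is a simplified model, not the code from
src/search.c: find_key() is a hypothetical stand-in for kwsexec(), it
handles a single keyword only, and the exit_early flag is an assumption
added purely to switch between the reported behaviour and the patched
one.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define NOT_FOUND ((size_t) -1)

/* Word constituents: letters, digits and underscore. */
static int is_word_char(unsigned char c)
{
    return isalnum(c) || c == '_';
}

/* Hypothetical stand-in for kwsexec(): naive search for key in
   text[0..size).  Returns the offset of the first hit or NOT_FOUND. */
static size_t find_key(const char *text, size_t size,
                       const char *key, size_t keylen)
{
    if (keylen == 0 || keylen > size)
        return NOT_FOUND;
    for (size_t i = 0; i + keylen <= size; i++)
        if (memcmp(text + i, key, keylen) == 0)
            return i;
    return NOT_FOUND;
}

/* Return the offset in buf of the first whole-word occurrence of key,
   or NOT_FOUND.  With exit_early != 0 the function gives up as soon as
   a candidate ends in the middle of a word (the reported behaviour);
   with exit_early == 0 it moves on to the next candidate (the effect
   of replacing the early return with a break). */
static size_t find_word(const char *buf, size_t size,
                        const char *key, int exit_early)
{
    assert(buf != NULL && key != NULL);
    size_t keylen = strlen(key);

    for (const char *beg = buf; beg <= buf + size; beg++) {
        size_t offset = find_key(beg, (size_t) (buf + size - beg),
                                 key, keylen);
        if (offset == NOT_FOUND)
            return NOT_FOUND;          /* nothing left in the buffer */
        beg += offset;

        int starts_mid = beg > buf
            && is_word_char((unsigned char) beg[-1]);
        int ends_mid = beg + keylen < buf + size
            && is_word_char((unsigned char) beg[keylen]);

        if (!starts_mid && !ends_mid)
            return (size_t) (beg - buf);   /* whole-word match */

        /* With one keyword, a retry with a shorter length can never
           succeed, so this is where the two behaviours diverge. */
        if (!starts_mid && ends_mid && exit_early)
            return NOT_FOUND;          /* abandons the rest of buf */

        /* Otherwise fall through and resume one byte further on. */
    }
    return NOT_FOUND;
}
```

   For the buffer "Non-match 31560\nMatch 3156\n" and the keyword
"3156", find_word() with exit_early set returns NOT_FOUND, while with
exit_early clear it returns 22, the offset of the 3156 on the second
line.  With 31560 changed to 03156 both variants return 22, which is
consistent with the observation above about matches that start on a
non-word boundary.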


   With the appended patch applied, grep gives the correct result:


my-system*[1] ../grep-2.5.1a/src/grep -wF -e 2908 -e 3440 -e 3156 test
Match 2908
Match 3440
Match 3156
--- src/search.c.orig   2001-04-19 04:42:14.000000000 +0100
+++ src/search.c        2005-04-27 12:19:35.000000000 +0100
@@ -554,14 +554,7 @@
            if (try + len < buf + size && WCHAR((unsigned char) try[len]))
              {
                offset = kwsexec (kwset, beg, --len, &kwsmatch);
-               if (offset == (size_t) -1)
-                 {
-#ifdef MBS_SUPPORT
-                   if (MB_CUR_MAX > 1)
-                     free (mb_properties);
-#endif /* MBS_SUPPORT */
-                   return offset;
-                 }
+               if (offset == (size_t) -1) break;
                try = beg + offset;
                len = kwsmatch.size[0];
              }
