bug-grep
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Boyer Moore overflow patch


From: Charles Levert
Subject: Re: Boyer Moore overflow patch
Date: Sat, 16 Jul 2005 14:03:45 -0400
User-agent: Mutt/1.4.1i

* On Saturday 2005-07-16 at 14:43:10 +0100, Julian Foad wrote:
> Charles Levert wrote:
> >I think the only pending patch that's affected
> >is one you were working on to fix a bug.
> 
> Sorry to be so forgetful, but could you post the URL or tracker number of 
> the particular bug that you are talking about?

Boyer Moore overflow patch:
   <http://lists.gnu.org/archive/html/bug-grep/2005-06/msg00039.html>
   <http://lists.gnu.org/archive/html/bug-grep/2005-06/msg00040.html>
   <http://lists.gnu.org/archive/html/bug-grep/2005-06/msg00048.html>

The following patch is all that remains to be
done in this file.  Other issues mentioned in the
first message above have already been dealt with.


> >--- src/kwset.c      2005-07-04 01:14:37 -0400
> >+++ src/kwset.c      2005-07-04 04:50:38 -0400
> >@@ -404,7 +404,13 @@ kwsprep (kwset_t kws)
> >     }
> >       /* Build the Boyer Moore delta.  Boy that's easy compared to CW. */
> >       for (i = 0; i < kwset->mind; ++i)
> >-    delta[U(kwset->target[i])] = kwset->mind - (i + 1);
> >+    {
> >+      int d = kwset->mind - (i + 1);
> >+
> >+      if (UCHAR_MAX < d)
> >+        d = UCHAR_MAX;
> >+      delta[U(kwset->target[i])] = d;
> >+    }
> >       /* Find the minimal delta2 shift that we might make after
> >      a backwards match has failed. */
> >       c = kwset->target[kwset->mind - 1];

Computing d first, as opposed to using a MIN()
macro straight out, avoids relying on the
compiler's optimizer to recognize and not compute
"kwset->mind - (i + 1)" twice.


> >along with tests that would be surrounded by a
> >
> >   for mode in F G E
> >
> >loop in "tests/foad1.sh".
> 
> Please could you write those tests?

# Test for Boyer-Moore bugs.  Thanks to Steve Summit for reporting the bug
# and providing a test case which inspired these.
b17='bbbbbbbbbbbbbbbbb'
b85="$b17$b17$b17$b17$b17"
b255="$b85$b85$b85"
x16='xxxxxxxxxxxxxxxx'
x64="$x16$x16$x16$x16"
x256="$x64$x64$x64$x64"
bm="cbbba${b255}c"
bt="${x256}${bm}${x16}xx/"
for mode in F G E; do
  grep_test "a${b255}a/"  ''     "a${b255}b"  -$mode
  grep_test "$bt"         "$bt"  "$bm"        -$mode
done


> In another email you wrote:
> >(I still don't understand the second test case,

Here are the original tests (which differ by
their terminating slash).

First test:
   grep_test "a${b255}a" "" "a${b255}b"
Second test:
   grep_test "${x256}${bm}${x16}xx" "${x256}${bm}${x16}xx/" "$bm"

It's just that I haven't figured out for myself
what's the buggy behavior that the second test
is after (and that the first one isn't already
covering).




reply via email to

[Prev in Thread] Current Thread [Next in Thread]