[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Boyer Moore overflow patch
From: |
Charles Levert |
Subject: |
Re: Boyer Moore overflow patch |
Date: |
Sat, 16 Jul 2005 14:03:45 -0400 |
User-agent: |
Mutt/1.4.1i |
* On Saturday 2005-07-16 at 14:43:10 +0100, Julian Foad wrote:
> Charles Levert wrote:
> >I think the only pending patch that's affected
> >is one you were working on to fix a bug.
>
> Sorry to be so forgetful, but could you post the URL or tracker number of
> the particular bug that you are talking about?
Boyer Moore overflow patch:
<http://lists.gnu.org/archive/html/bug-grep/2005-06/msg00039.html>
<http://lists.gnu.org/archive/html/bug-grep/2005-06/msg00040.html>
<http://lists.gnu.org/archive/html/bug-grep/2005-06/msg00048.html>
The following patch is all that remains to be
done in this file. Other issues mentioned in the
first message above have already been dealt with.
> >--- src/kwset.c 2005-07-04 01:14:37 -0400
> >+++ src/kwset.c 2005-07-04 04:50:38 -0400
> >@@ -404,7 +404,13 @@ kwsprep (kwset_t kws)
> > }
> > /* Build the Boyer Moore delta. Boy that's easy compared to CW. */
> > for (i = 0; i < kwset->mind; ++i)
> >- delta[U(kwset->target[i])] = kwset->mind - (i + 1);
> >+ {
> >+ int d = kwset->mind - (i + 1);
> >+
> >+ if (UCHAR_MAX < d)
> >+ d = UCHAR_MAX;
> >+ delta[U(kwset->target[i])] = d;
> >+ }
> > /* Find the minimal delta2 shift that we might make after
> > a backwards match has failed. */
> > c = kwset->target[kwset->mind - 1];
Computing d first, as opposed to using a MIN()
macro straight out, avoids relying on the
compiler's optimizer to recognize and not compute
"kwset->mind - (i + 1)" twice.
> >along with tests that would be surrounded by a
> >
> > for mode in F G E
> >
> >loop in "tests/foad1.sh".
>
> Please could you write those tests?
# Test for Boyer-Moore bugs. Thanks to Steve Summit for reporting the bug
# and providing a test case which inspired these.
b17='bbbbbbbbbbbbbbbbb'
b85="$b17$b17$b17$b17$b17"
b255="$b85$b85$b85"
x16='xxxxxxxxxxxxxxxx'
x64="$x16$x16$x16$x16"
x256="$x64$x64$x64$x64"
bm="cbbba${b255}c"
bt="${x256}${bm}${x16}xx/"
for mode in F G E; do
grep_test "a${b255}a/" '' "a${b255}b" -$mode
grep_test "$bt" "$bt" "$bm" -$mode
done
> In another email you wrote:
> >(I still don't understand the second test case,
Here are the original tests (which differ by
their terminating slash).
First test:
grep_test "a${b255}a" "" "a${b255}b"
Second test:
grep_test "${x256}${bm}${x16}xx" "${x256}${bm}${x16}xx/" "$bm"
It's just that I haven't figured out for myself
what's the buggy behavior that the second test
is after (and that the first one isn't already
covering).