[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 2/2] regex: test for buffer overrun
From: |
Dmitry V. Levin |
Subject: |
Re: [PATCH 2/2] regex: test for buffer overrun |
Date: |
Thu, 11 Apr 2013 08:48:06 +0400 |
On Sun, Mar 31, 2013 at 04:11:41PM +0100, Nix wrote:
> On 30 Jan 2013, Paul Eggert spake thusly:
>
> > + /* This test is from glibc bug 15078.
> > + The test case is from Andreas Schwab in
> > +
> > <http://www.sourceware.org/ml/libc-alpha/2013-01/msg00967.html>.
> > + */
> > + static char const pat[] = "[^x]x";
> > + static char const data[] =
> > + "\xe1\x80\x80\xe1\x80\xbb\xe1\x80\xbd\xe1\x80\x94\xe1\x80"
> > + "\xba\xe1\x80\xaf\xe1\x80\x95\xe1\x80\xbax";
> > + re_set_syntax (0);
> > + memset (®ex, 0, sizeof regex);
> > + s = re_compile_pattern (pat, sizeof pat - 1, ®ex);
> > + if (s)
> > + result |= 1;
> > + else if (re_search (®ex, data, sizeof data - 1,
> > + 0, sizeof data - 1, 0)
> > + != 21)
> > + result |= 1;
>
> I note that a glibc 2.17 with 7e2f0d2d77e4bc273fe00f99d970605d8e38d4d6
> and a445af0bc722d620afed7683cd320c0e4c7c6059 (Andreas's fix) applied
> does not crash on this test -- but does not appear to work as it expects
> (or as I'd expect) either, returning 0.
The data used for this test is not an arbitrary 25-byte sequence, it is
actually an 8-character collating element
<U1000><U103B><U103D><U1014><U103A><U102F><U1015><U103A>
defined in localedata/locales/iso14651_t1_common as <MM_FirstPerson>,
followed by character 'x'.
According to POSIX, "It is unspecified whether a non-matching list
expression matches a multi-character collating element that is not matched
by any of the expressions."
The difference between glibc and gnulib implementations of re_search() is
that glibc's re_search() matches multi-character collating elements in
that case while gnulib's re_search() doesn't.
For details see e.g. re_string_elem_size_at().
I think gnulib's tests/test-regex.c should allow glibc's re_search()
behavior so that various utilities built --without-included-regex
wouldn't be penalized by test-regex:
diff --git a/tests/test-regex.c b/tests/test-regex.c
index 5a94c14..15542f4 100644
--- a/tests/test-regex.c
+++ b/tests/test-regex.c
@@ -79,17 +79,28 @@ main (void)
*/
static char const pat[] = "[^x]x";
static char const data[] =
- "\xe1\x80\x80\xe1\x80\xbb\xe1\x80\xbd\xe1\x80\x94\xe1\x80"
- "\xba\xe1\x80\xaf\xe1\x80\x95\xe1\x80\xbax";
+ /* <U1000><U103B><U103D><U1014><U103A><U102F><U1015><U103A> */
+ "\xe1\x80\x80"
+ "\xe1\x80\xbb"
+ "\xe1\x80\xbd"
+ "\xe1\x80\x94"
+ "\xe1\x80\xba"
+ "\xe1\x80\xaf"
+ "\xe1\x80\x95"
+ "\xe1\x80\xba"
+ "x";
re_set_syntax (0);
memset (®ex, 0, sizeof regex);
s = re_compile_pattern (pat, sizeof pat - 1, ®ex);
if (s)
result |= 1;
- else if (re_search (®ex, data, sizeof data - 1,
- 0, sizeof data - 1, 0)
- != 21)
- result |= 1;
+ else
+ {
+ i = re_search (®ex, data, sizeof data - 1,
+ 0, sizeof data - 1, 0);
+ if (i != 0 && i != 21)
+ result |= 1;
+ }
}
if (! setlocale (LC_ALL, "C"))
If the change is OK, I'll add a commit message.
> Introducing a single spurious
> character after the first byte of 'data', like so:
>
> static char const data[] =
> "\xe1""a\x80\x80\xe1\x80\xbb\xe1\x80\xbd\xe1\x80\x94\xe1\x80"
> "\xba\xe1\x80\xaf\xe1\x80\x95\xe1\x80\xbax";
>
> changes the return value of re_search() to 16, still not right.
> (This is with locale set to en_US.UTF-8, just as in the original glibc
> testcase, which *does* pass.)
Your "single spurious character" broke that long collating element. The
last 3 characters of it (<U102F><U1015><U103A>) appeared to be another
collating element called <MMV102FK1015A>, so glibc's re_search() is
perfectly correct to return 1+5*3 in that case.
--
ldv
pgpcDBTNkow4O.pgp
Description: PGP signature
- Re: [PATCH 2/2] regex: test for buffer overrun, Nix, 2013/04/01
- Re: [PATCH 2/2] regex: test for buffer overrun, Carlos O'Donell, 2013/04/09
- Re: [PATCH 2/2] regex: test for buffer overrun,
Dmitry V. Levin <=
- Re: [PATCH 2/2] regex: test for buffer overrun, Paul Eggert, 2013/04/11
- Re: [PATCH 2/2] regex: test for buffer overrun, Dmitry V. Levin, 2013/04/11
- [PATCH] regex-tests, regex: allow glibc re_search behavior, Dmitry V. Levin, 2013/04/11
- Re: [PATCH] regex-tests, regex: allow glibc re_search behavior, Paul Eggert, 2013/04/11
- Re: [PATCH] regex-tests, regex: allow glibc re_search behavior, Dmitry V. Levin, 2013/04/11