bug-gnulib

Plain text matching with regex: RE_PLAIN


From: Reuben Thomas
Subject: Plain text matching with regex: RE_PLAIN
Date: Wed, 15 Sep 2010 01:27:17 +0100

With my "maintainer of GNU Zile" hat on, I was improving the searching
code recently, and a thought struck me which has often struck me
before: the code would be much simpler if I could do non-regex
searches using the regex APIs. In particular, I had written (simple)
text searching routines, and had both to maintain them, and decide
whether to use them or regex for each search.

Since I have recently been looking at the regex code quite a lot, I
thought this time I would see how easy it might be to implement.

The answer is: very easy indeed.

The attached patch adds a new syntax flag, RE_PLAIN. Although the
patch looks quite long at first glance, it is mostly reindentation.
The code consists of a #define of RE_PLAIN (thus reducing the number
of spare flag bits on a 32-bit machine from 6 to 5, but I think this
is a justifiable use of a flag), and two changes to the function
peek_token in regcomp.c, one to add a test of RE_PLAIN to an existing
if, and another to wrap an entire block in such a test. The idea is
very simple: when RE_PLAIN is used, the parser is prevented from
assigning any type other than CHARACTER to a token, and no parsing
routine beyond peek_token is ever called.
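
To illustrate the idea (this is a toy model, not the code in regcomp.c or
in the patch; all names prefixed toy_ are made up for the example), a
tokenizer with a "plain" switch might look roughly like this:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy token types; the real tokenizer distinguishes many more.  */
enum toy_type
{
  TOY_CHARACTER,      /* an ordinary literal character */
  TOY_ANY_CHAR,       /* '.' */
  TOY_DUP_ASTERISK,   /* '*' */
  TOY_OPEN_BRACKET    /* '[' */
};

struct toy_token
{
  enum toy_type type;
  char ch;
};

/* Classify one pattern character.  When PLAIN is true, the switch that
   recognises operators is skipped entirely, so every token comes back
   as TOY_CHARACTER.  */
static struct toy_token
toy_peek_token (char c, bool plain)
{
  struct toy_token tok = { TOY_CHARACTER, c };

  if (!plain)
    switch (c)
      {
      case '.': tok.type = TOY_ANY_CHAR; break;
      case '*': tok.type = TOY_DUP_ASTERISK; break;
      case '[': tok.type = TOY_OPEN_BRACKET; break;
      default: break;
      }

  return tok;
}
```

Since the later parsing stages only ever see literal characters in plain
mode, nothing downstream needs to know about the flag.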

I saved about 50 lines of C in Zile for the cost of these three changes,
which seems like a good start...
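
For a feel of the caller's side, here is a rough sketch (not Zile's
actual code). The helper name plain_search is hypothetical, and the
RE_PLAIN branch assumes the flag is passed to re_set_syntax like any
other syntax bit. Where the header lacks RE_PLAIN, the sketch falls back
to the workaround the flag would make unnecessary: backslash-escaping
the POSIX basic-regex special characters by hand.

```c
#define _GNU_SOURCE   /* exposes the GNU regex API in glibc's <regex.h> */
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Return the offset of the first occurrence of the literal string
   NEEDLE in HAYSTACK, or -1 if there is none (or on error).  */
static long
plain_search (const char *needle, const char *haystack)
{
  assert (needle != NULL && haystack != NULL);
#ifdef RE_PLAIN
  /* Assumed usage of the proposed flag: all characters are literal.  */
  struct re_pattern_buffer buf;
  memset (&buf, 0, sizeof buf);
  re_set_syntax (RE_PLAIN);
  if (re_compile_pattern (needle, strlen (needle), &buf) != NULL)
    return -1;
  long pos = re_search (&buf, haystack, strlen (haystack),
                        0, strlen (haystack), NULL);
  regfree (&buf);
  return pos < 0 ? -1 : pos;
#else
  /* Fallback: escape the characters that are special in a POSIX basic
     regular expression, then use the ordinary POSIX calls.  */
  char *escaped = malloc (2 * strlen (needle) + 1);
  if (escaped == NULL)
    return -1;
  char *out = escaped;
  for (const char *p = needle; *p != '\0'; p++)
    {
      if (strchr (".[\\*^$", *p) != NULL)
        *out++ = '\\';
      *out++ = *p;
    }
  *out = '\0';

  regex_t re;
  int rc = regcomp (&re, escaped, 0);
  free (escaped);
  if (rc != 0)
    return -1;

  regmatch_t m;
  long pos = regexec (&re, haystack, 1, &m, 0) == 0 ? (long) m.rm_so : -1;
  regfree (&re);
  return pos;
#endif
}
```

Either way the caller makes a single call such as
plain_search ("a.c", text), and no separate text-search routine has to
be maintained alongside the regex one.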

If this feature is approved, I would of course write a documentation
patch (for regex.texi; my patch already includes documentation in
regex.h) to go with my code patch.

There is one other potential advantage to adopting this patch, even if
it is a rather odd one: currently, gnulib uses a small set of tests to
determine whether or not to use the system regex. With the addition of
this new feature, this set of tests could be replaced by a simple test
for RE_PLAIN. (Of course, as and when further bugs are fixed, it would
be desirable to add more tests, but this new feature provides a nice
epoch.)

I have written some autoconf code to test for this feature which can
already be used, based on the existing test, thus:

dnl If system lacks RE_PLAIN, force --with-included-regex
AC_MSG_CHECKING([whether system regex.h has RE_PLAIN])
AC_COMPILE_IFELSE(
  [AC_LANG_PROGRAM(
    [AC_INCLUDES_DEFAULT[
     #include <regex.h>
     ]],
    [[reg_syntax_t syn = RE_PLAIN;]])],
 [AC_MSG_RESULT([yes])],
 [AC_MSG_RESULT([no])
 with_included_regex=yes])

In GNU Zile's configure.ac, I place this code directly before gl_INIT,
as gl_INIT runs the code that decides whether to use the system regex
or gnulib's copy.

-- 
http://rrt.sc3d.org

Attachment: 0010-Add-RE_PLAIN-flag-to-match-plain-text-patterns.patch
Description: Binary data

