bug-gnulib

A little more regex.h pedantry


From: Reuben Thomas
Subject: A little more regex.h pedantry
Date: Fri, 30 Jul 2010 23:09:01 +0100

Sigh. I've been picking my way through this paragraph, looking at the code:

/* This data structure represents a compiled pattern.  Before calling
   the pattern compiler, the fields `buffer', `allocated', `fastmap',
   `translate', and `no_sub' can be set.  After the pattern has been
   compiled, the `re_nsub' field is available.  All other fields are
   private to the regex routines.  */

We observed earlier that there is an omission on the "After" side:
not_bol and not_eol are respected during pattern matching (the
equivalent of POSIX eflags).
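
For comparison, here is a minimal sketch of the POSIX side of that
equivalence: REG_NOTBOL and REG_NOTEOL passed as eflags to regexec.
The helper name matches_with_eflags is made up for illustration and is
not part of any library.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not a library function): returns 1 if `pattern'
   matches `text' under the given regexec eflags, 0 if it does not,
   and -1 if the pattern fails to compile.  */
int matches_with_eflags(const char *pattern, const char *text, int eflags)
{
    regex_t re;

    /* REG_NOSUB: only a yes/no answer is wanted, no match offsets.  */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    /* eflags may contain REG_NOTBOL and/or REG_NOTEOL, which stop `^'
       and `$' from matching at the start and end of `text'.  */
    int rc = regexec(&re, text, 0, NULL, eflags);

    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Calling matches_with_eflags("^abc", "abc", 0) should report a match,
while the same call with REG_NOTBOL should not, since the anchor is
then not allowed to match at the start of the string.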

What I have only just noticed, and confirmed from the code, is that
the list of fields that can be set before compilation is excessive. In
particular, `fastmap' can't be set (you have to call
re_compile_fastmap), and `no_sub' can't be set (because
re_compile_pattern always overwrites it, as it does newline_anchor).
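
A minimal sketch of how one might check the overwriting claim,
assuming glibc's regex.h with _GNU_SOURCE defined (which exposes
re_compile_pattern and the unprefixed field names). The struct
field_report and the function compile_and_report are hypothetical names
introduced only for this illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* assumption: glibc; exposes the GNU re_* API */
#endif
#include <regex.h>
#include <string.h>

/* Hypothetical record of the field values seen after compilation.  */
struct field_report {
    int compiled;             /* 1 if the pattern compiled, else 0 */
    unsigned no_sub;          /* value of no_sub after compiling */
    unsigned newline_anchor;  /* value of newline_anchor after compiling */
    size_t re_nsub;           /* number of parenthesized subexpressions */
};

/* Hypothetical helper: deliberately set no_sub and newline_anchor to
   values the compiler is not expected to keep, compile `pattern', and
   report what the fields hold afterwards.  */
struct field_report compile_and_report(const char *pattern)
{
    struct field_report r = {0, 0, 0, 0};
    struct re_pattern_buffer pat;

    memset(&pat, 0, sizeof pat);  /* buffer = NULL, allocated = 0, translate = NULL */
    pat.no_sub = 1;               /* set before compiling ... */
    pat.newline_anchor = 0;       /* ... to see whether either survives */

    /* Note: re_set_syntax changes process-wide state.  This syntax
       does not include RE_NO_SUB.  */
    re_set_syntax(RE_SYNTAX_POSIX_EXTENDED);

    if (re_compile_pattern(pattern, strlen(pattern), &pat) != NULL)
        return r;                 /* compilation failed */

    r.compiled = 1;
    r.no_sub = pat.no_sub;
    r.newline_anchor = pat.newline_anchor;
    r.re_nsub = pat.re_nsub;

    regfree(&pat);
    return r;
}
```

If the reading of the code above is right, compile_and_report("a(b+)c")
should come back with no_sub reset to 0 and newline_anchor set to 1,
whatever was stored in them beforehand, and with re_nsub equal to 1.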

Does this analysis look right? I therefore claim that a correct
version of the paragraph is:

/* This data structure represents a compiled pattern.  Before calling
   the pattern compiler, the fields `buffer', `allocated', and
   `translate' can be set.  After the pattern has been compiled, the
   fields `re_nsub', `not_bol' and `not_eol' are available.  All other
   fields are private to the regex routines. */

There's one more thing that seems dubious to me, and that's the
mention of `allocated'. Surely you shouldn't be fiddling with it
directly, but calling re_set_registers (and similarly fastmap and
re_compile_fastmap)? The API function re_set_registers has been
available since June 1992 (or perhaps that should be release 0.12 in
June 1993), and re_compile_fastmap since before the regex ChangeLog
began, i.e. since at least October 1989 (though no later release is
documented before 0.12). Still, that seems long enough to deprecate
the manual setting of those particular fields, at least as mildly as
by not mentioning them in the documentation. Hence, I would write:

/* This data structure represents a compiled pattern.  Before calling
   the pattern compiler, the fields `buffer' and `translate' can be set.
   After the pattern has been compiled, the fields `re_nsub',
   `not_bol' and `not_eol' are available.  All other fields are private
   to the regex routines. */

Again, I would be most grateful for scrutiny and suggestions so I can
get the highest quality patch for glibc.

-- 
http://rrt.sc3d.org


