bug-gnulib

better word-list-to-minimized-regexp code?


From: Jim Meyering
Subject: better word-list-to-minimized-regexp code?
Date: Sat, 14 Nov 2009 09:59:13 +0100

You may have seen the tests in maint.mk that help you avoid inclusion
of unused header files, e.g.,

    # Prohibit the inclusion of assert.h without an actual use of assert.
    sc_prohibit_assert_without_use:
            @h='<assert.h>' re='\<assert *\(' $(_header_without_use)

That causes "make syntax-check" to fail if you include <assert.h>
in a file with no "use" of assert.  Of course, it's a naive check,
and can be tricked by #if-0'd or commented-out code, but then again,
leaving in an unnecessary #include is not serious.
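
For readers unfamiliar with the rule, here is a rough sketch of the
two-step idea behind such a check.  It is not the actual
_header_without_use macro from maint.mk; the temporary directory and the
two sample files are made up for illustration, and "grep -L" and "\<" are
assumed to be available (they are in GNU grep).

```shell
# Hypothetical stand-in for the check; not the maint.mk macro itself.
tmp=$(mktemp -d)
printf '#include <assert.h>\nint main (void) { assert (1); return 0; }\n' \
  > "$tmp/used.c"
printf '#include <assert.h>\nint main (void) { return 0; }\n' \
  > "$tmp/unused.c"

h='<assert.h>'
re='\<assert *\('

# Step 1: list the files that include the header.
including=$(grep -l "^# *include $h" "$tmp"/*.c)
# Step 2: of those, list the files in which the "use" regexp never matches.
offenders=$(grep -LE "$re" $including || true)

echo "includes $h without use: $offenders"
rm -rf "$tmp"
```

Only the sample file that never calls assert is reported, which is the
condition under which "make syntax-check" fails.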

I wanted one for xalloc.h, too, and here's the story:

Extracting the symbols is easy (see below), but converting such a long
word list to a regexp is tedious.  So automate that, too.  But with what?
I've tried the perl modules, Regexp::Assemble and Regexp::List,
and neither did what I wanted.

Here's the list of the "x" symbols:

    x2realloc
    xalloc_die
    xalloc_oversized
    xcalloc
    xmalloc
    xmemdup
    xrealloc
    xstrdup
    xzalloc

The modules I tried produced this (or equiv):

    x(alloc_(oversized|die)|([cz]|2?re)alloc|m(alloc|emdup)|strdup)

Both handled "xmalloc" suboptimally: the result is correct, but not minimal.
They missed the fact that sharing both the short prefix of "x" and the
suffix of "alloc" would be better than merely sharing the "xm" prefix.
Doing it by hand, I get this:

    x(alloc_(oversized|die)|([cmz]|2?re)alloc|(mem|str)dup)
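
A quick way to confirm that the hand-written expression still matches
every symbol, and is shorter than the generated one, is sketched below.
The "unmatched" helper is a name invented here; "grep -x" anchors each
expression to the whole word.

```shell
words='x2realloc xalloc_die xalloc_oversized xcalloc xmalloc xmemdup xrealloc xstrdup xzalloc'
generated='x(alloc_(oversized|die)|([cz]|2?re)alloc|m(alloc|emdup)|strdup)'
by_hand='x(alloc_(oversized|die)|([cmz]|2?re)alloc|(mem|str)dup)'

# Hypothetical helper: print the words that the ERE in $1 fails to match
# in full.  grep exits nonzero when nothing is printed, hence "|| true".
unmatched() {
  printf '%s\n' $words | grep -vxE "$1" || true
}

for re in "$generated" "$by_hand"; do
  echo "${#re} chars, unmatched: $(unmatched "$re")"
done
```

An empty "unmatched" list for both expressions means they accept the same
nine symbols, so the only difference is length.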

If you change the list of inputs via s/xmalloc/xgalloc/,
you see that they get it right:

    x(alloc_(oversized|die)|([cgz]|2?re)alloc|(mem|str)dup)

It appears that they are tricked by a local minimum:
"xm" is a longer shared prefix than merely "x", so they factor on "xm"
and never consider the shared "alloc" suffix.
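
One way to see the trade-off, as a sketch: count how many symbols each
candidate factoring covers.  The two patterns below are chosen here purely
for illustration and are not what either module computes internally.

```shell
words='x2realloc xalloc_die xalloc_oversized xcalloc xmalloc xmemdup xrealloc xstrdup xzalloc'

# Symbols sharing the longer prefix "xm".
by_prefix=$(printf '%s\n' $words | grep -c '^xm')
# Symbols of the form "x", one letter, then the suffix "alloc".
by_suffix=$(printf '%s\n' $words | grep -c '^x[a-z]alloc$')

echo "prefix xm: $by_prefix symbols, x?alloc: $by_suffix symbols"
```

Grouping on the "alloc" suffix covers more symbols than grouping on the
"xm" prefix, which is what the hand-written version exploits.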

Here's what I did to test Regexp::List:

$ { perl -lne '/^# *define (\w+)\(/ and print $1' lib/xalloc.h|grep -v '^__';
    perl -lne '/^(?:extern )?(?:void|char) \*?(\w+) \(/ and print $1' \
      lib/xalloc.h; } \
  |grep '^x' \
  |perl -MRegexp::List -le \
     'print Regexp::List->new->list2re(<>)'| sed 's/\?://g'
(?-xism:x(alloc_(die|oversized)\
|m(alloc|emdup)\
|((2?re|[cz])alloc|strdup)\
))

For Regexp::Assemble, see the comments below:

From 9a93371caac4a9440c41d3269b21db585920e4c4 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Sat, 14 Nov 2009 09:53:26 +0100
Subject: [PATCH] maint.mk: Prohibit inclusion of "xalloc.h" without use.

* top/maint.mk (sc_prohibit_xalloc_without_use): New rule.
---
 ChangeLog    |    5 +++++
 top/maint.mk |   17 +++++++++++++++++
 2 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index e23f285..c821d8d 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2009-11-14  Jim Meyering  <address@hidden>
+
+       maint.mk: Prohibit inclusion of "xalloc.h" without use.
+       * top/maint.mk (sc_prohibit_xalloc_without_use): New rule.
+
 2009-11-14  John W. Eaton  <address@hidden>

        strftime.h: wrap funtion declaration in extern "C" block
diff --git a/top/maint.mk b/top/maint.mk
index 73ea8ea..34d66e1 100644
--- a/top/maint.mk
+++ b/top/maint.mk
@@ -289,6 +289,23 @@ sc_prohibit_error_without_use:
        re='\<error(_at_line|_print_progname|_one_per_line|_message_count)? *\('\
          $(_header_without_use)

+# Don't include xalloc.h unless you use one of its functions.
+# Consider these symbols:
+# perl -lne '/^# *define (\w+)\(/ and print $1' lib/xalloc.h|grep -v '^__';
+# perl -lne '/^(?:extern )?(?:void|char) \*?(\w+) \(/ and print $1' lib/xalloc.h
+# Divide into two sets on case, and filter each through this:
+# | sort | perl -MRegexp::Assemble -le \
+#  'print Regexp::Assemble->new(file => "/dev/stdin")->as_string'|sed 's/\?://g'
+# Note this was produced by the above:
+# _xa1 = x(alloc_(oversized|die)|([cz]|2?re)alloc|m(alloc|emdup)|strdup)
+# But we can do better:
+_xa1 = x(alloc_(oversized|die)|([cmz]|2?re)alloc|(mem|str)dup)
+_xa2 = X([CZ]|N?M)ALLOC
+sc_prohibit_xalloc_without_use:
+       @h='"xalloc.h"' \
+       re='\<($(_xa1)|$(_xa2)) *\('\
+         $(_header_without_use)
+
 sc_prohibit_safe_read_without_use:
        @h='"safe-read.h"' re='(\<SAFE_READ_ERROR\>|\<safe_read *\()' \
          $(_header_without_use)
--
1.6.5.2.372.gc0502



