guile-devel

Re: Regex parenthesis handling bug


From: Neil Jerram
Subject: Re: Regex parenthesis handling bug
Date: 22 Oct 2001 23:32:23 +0100
User-agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/20.7

>>>>> "Gary" == Gary Houston <address@hidden> writes:

    >> From: Neil Jerram <address@hidden> Date: 21 Oct 2001
    >> 11:18:50 +0100
    >> 
    >> According to the libc manual, all libc regex settings default
    >> to the Emacs behaviour, so unquoted parens should match literal
    >> parens in the match string, while quoted parens indicate
    >> grouping.
    >> 
    >> In Guile, it doesn't work like this...
    >> 
    >> guile> (string-match "\\(x\\)" "x")
    >> $3 = #f

    Gary> The libguile functions use POSIX interfaces with
    Gary> REG_EXTENDED defined, which makes unquoted parens into match
    Gary> delimiters.  Adding the REG_BASIC flag should reverse it.  I
    Gary> can't find anything about Emacs behaviour in the glibc
    Gary> (2.2.2) manual, [...]

Sorry, I think I was misremembering /usr/include/regex.h, which says:

/* The following bits are used to determine the regexp syntax we
   recognize.  The set/not-set meanings are chosen so that Emacs syntax
   remains the value 0.  The bits are given in alphabetical order, and
   the definitions shifted by one from the previous bit; thus, when we
   add or remove a bit, only one other definition need change.  */
typedef unsigned long int reg_syntax_t;

But it turns out that the setting of re_syntax_options doesn't apply
to POSIX regcomp, but to the GNUish re_compile_pattern function.  If I
write a new `make-emacs-regexp' primitive using re_compile_pattern
rather than regcomp, it gives the desired behaviour.
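
As a rough illustration of the approach described above, here is a
minimal C sketch that compiles a pattern with re_compile_pattern under
Emacs syntax and then searches with re_search.  It assumes glibc's
<regex.h> with _GNU_SOURCE defined; the helper name
`emacs_syntax_search' is hypothetical and is not the proposed
`make-emacs-regexp' primitive.

```c
#define _GNU_SOURCE             /* exposes the GNU regex API in glibc */
#include <regex.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
   Returns the offset where the match starts, -1 if there is no match,
   -2 on an internal matcher error (both as reported by re_search),
   or -3 if the pattern fails to compile.  */
static int
emacs_syntax_search (const char *pattern, const char *text)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old_syntax;
  const char *err;
  int pos;

  /* A zeroed buffer tells the library to allocate everything itself.  */
  memset (&buf, 0, sizeof buf);

  /* re_compile_pattern reads the global syntax setting, so select
     Emacs syntax for the compile and restore the old value after.  */
  old_syntax = re_set_syntax (RE_SYNTAX_EMACS);
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old_syntax);
  if (err != NULL)
    return -3;

  /* Passing NULL for the registers: only the match position is needed.  */
  pos = (int) re_search (&buf, text, strlen (text), 0, strlen (text), NULL);
  regfree (&buf);
  return pos;
}
```

With this sketch, emacs_syntax_search ("\\(x\\)", "x") and
emacs_syntax_search ("ba+c", "abaaac") should both report a match,
while an unquoted "(x)" should only match a literal "(x)".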

That still leaves a problem for non-glibc systems, where Elisp regex
support is concerned, but I guess that can be solved by including
source code from Emacs where necessary.

(Note that the available flags for `make-regexp' don't give me what I
want in general:

  (string-match-basic "\\(x\\)" "x") => #("x" (0 . 1) (0 . 1))

is good (i.e. Emacs-compatible), but

  (string-match-basic "ba+c" "abaaac") => #f

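is not, because `+' is an ordinary character in POSIX basic syntax
but a repetition operator in Emacs syntax.)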
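
The same mismatch can be reproduced directly against POSIX regcomp,
without going through Guile.  The following is a small sketch, not
Guile's actual implementation; the helper name `posix_matches' is
hypothetical.  Passing 0 as the flags selects basic syntax, and
REG_EXTENDED selects extended syntax.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
   Returns 1 if PATTERN matches somewhere in TEXT, 0 if it does not,
   and -1 if the pattern fails to compile.  CFLAGS is 0 for POSIX
   basic syntax or REG_EXTENDED for extended syntax.  */
static int
posix_matches (const char *pattern, const char *text, int cflags)
{
  regex_t re;
  int rc;

  /* REG_NOSUB: we only want a yes/no answer, not submatch offsets.  */
  if (regcomp (&re, pattern, cflags | REG_NOSUB) != 0)
    return -1;

  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

On a glibc system, posix_matches ("\\(x\\)", "x", 0) should succeed
and posix_matches ("ba+c", "abaaac", 0) should fail, while with
REG_EXTENDED the two results swap.  Neither flag setting gives the
Emacs combination of quoted grouping parens and an unquoted `+'
operator.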

Thanks,
        Neil




