emacs-devel
[BUG] Regexp compiler, problem with character classes


From: Johan Bockgård
Subject: [BUG] Regexp compiler, problem with character classes
Date: Wed, 06 Sep 2006 10:46:30 -0000
User-agent: Gnus/5.110005 (No Gnus v0.5) Emacs/22.0.50 (gnu/linux)

[I'm resending this because I think it's a serious bug. It makes
character classes totally unreliable.]

Character classes are translated to character alternatives during the
regexp compile phase. This is wrong, since the syntax table should be
taken into account during the actual matching. This may be non-trivial
to fix.


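    (with-temp-buffer
      (list
       ;; Give ?a whitespace syntax: [[:space:]] should match it.
       (progn (modify-syntax-entry ?a " ")
              (string-match "x[[:space:]]" "xa"))
       ;; Give ?a word syntax: [[:space:]] should no longer match it.
       (progn (modify-syntax-entry ?a "w")
              (string-match "x[[:space:]]" "xa"))))
    => (0 0)    ; expected (0 nil)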



Compiling pattern: x[[:space:]]

Compiled pattern: 
38 bytes used/174 bytes allocated.
fastmap: x
re_nsub: 0      regs_alloc: 0   can_be_null: 0  no_sub: 0       not_bol: 0      
not_eol: 0      syntax: 340204
0:      /exactn/1/x
3:      /charset [\t\f a\302\200-\303\277]
37:     /succeed
38:     end of pattern.
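
The difference between the two strategies can be modeled outside
Emacs. The following C sketch is not taken from regex.c: every type
and function name in it is hypothetical, and the 256-entry table is a
simplifying assumption. It contrasts expanding a class into a fixed
character set at compile time with looking the character up in the
table at match time.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified model: a "syntax table" maps each byte to a
   syntax class.  None of these names come from the Emacs sources.  */
enum syntax_class { SYN_OTHER, SYN_SPACE, SYN_WORD };

struct syntax_table { enum syntax_class cls[256]; };

/* An eagerly compiled [[:space:]]: the member set is frozen when the
   pattern is compiled.  */
struct compiled_class { bool members[256]; };

/* Eager strategy (what the report describes): expand the class into a
   fixed set of characters using the table as it is right now.  */
static void compile_space_eager(struct compiled_class *out,
                                const struct syntax_table *tab)
{
  for (int c = 0; c < 256; c++)
    out->members[c] = (tab->cls[c] == SYN_SPACE);
}

/* Eager match: only the frozen set is consulted.  */
static bool match_space_eager(const struct compiled_class *cc,
                              unsigned char c)
{
  return cc->members[c];
}

/* Lazy strategy: keep the class symbolic and consult the table that
   is in effect at match time.  */
static bool match_space_lazy(const struct syntax_table *tab,
                             unsigned char c)
{
  return tab->cls[c] == SYN_SPACE;
}

/* Returns true if the two strategies disagree once the table has been
   changed after compilation.  */
static bool strategies_diverge(void)
{
  struct syntax_table tab;
  struct compiled_class cc;

  memset(&tab, 0, sizeof tab);   /* every entry starts as SYN_OTHER */
  tab.cls['a'] = SYN_SPACE;      /* like (modify-syntax-entry ?a " ") */
  compile_space_eager(&cc, &tab);

  tab.cls['a'] = SYN_WORD;       /* like (modify-syntax-entry ?a "w") */

  /* The frozen set still contains 'a'; the live table does not.  */
  return match_space_eager(&cc, 'a') && !match_space_lazy(&tab, 'a');
}
```

In this model the eager matcher keeps accepting `a` after its syntax
has changed, which mirrors the (0 0) result above. A lazy lookup costs
a table access per character at match time, which is presumably part
of why a fix may be non-trivial.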



As a consequence, you get the behavior below, because the compiler
makes no effort to set up the syntax table in the first place:


1)

    emacs -Q

    (with-temp-buffer
      (string-match "x[[:space:]]" "x\n"))

    => nil

(exit Emacs)


2)
    emacs -Q

    (with-temp-buffer
      (char-syntax ?\n)
      (string-match "x[[:space:]]" "x\n"))

    => 0


(The difference comes from Fchar_syntax, which does
    gl_state.current_syntax_table = current_buffer->syntax_table;
as a side effect.)
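
The order dependence can be illustrated with a small C model. This is
only a sketch: the structures are hypothetical stand-ins for gl_state
and the buffer, and it assumes the global still points at some other
table before anything synchronizes it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real gl_state and buffer structures in
   Emacs are far richer than this.  */
struct syntax_table { bool is_space[256]; };

struct buffer { const struct syntax_table *syntax_table; };

/* Global "current syntax table", playing the role of
   gl_state.current_syntax_table.  */
static const struct syntax_table *current_syntax_table;

/* Models the side effect quoted from Fchar_syntax.  */
static void sync_syntax_table(const struct buffer *buf)
{
  current_syntax_table = buf->syntax_table;
}

/* A compiler that trusts the global without setting it up first.  */
static bool compile_sees_space(unsigned char c)
{
  if (current_syntax_table == NULL)
    return false;
  return current_syntax_table->is_space[c];
}

/* A compiler that synchronizes the global before using it.  */
static bool compile_sees_space_synced(const struct buffer *buf,
                                      unsigned char c)
{
  sync_syntax_table(buf);
  return compile_sees_space(c);
}

/* Returns true if the result depends on whether something happened to
   synchronize the global beforehand.  */
static bool order_dependence_demo(void)
{
  static struct syntax_table stale;     /* zeroed: nothing is space */
  static struct syntax_table buf_table;
  struct buffer buf = { &buf_table };

  buf_table.is_space['\n'] = true;
  current_syntax_table = &stale;        /* assumed leftover state */

  bool before = compile_sees_space('\n');  /* false, like case 1 */
  sync_syntax_table(&buf);                 /* like (char-syntax ?\n) */
  bool after = compile_sees_space('\n');   /* true, like case 2 */

  return !before && after;
}
```

The synced variant gives the same answer regardless of what ran
earlier, at the cost of one assignment per compile.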

-- 
This is bad.



