emacs-devel

Re: 23.0.60; Segmentation fault loading auto-lang.el


From: Stefan Monnier
Subject: Re: 23.0.60; Segmentation fault loading auto-lang.el
Date: Tue, 08 Apr 2008 21:42:14 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.60 (gnu/linux)

>>> (let ((str (string-as-unibyte "ä")))
>>> (string-match (char-to-string (string-to-char str)) str))
>> 
>>> evaluates to 0 in Emacs 22, and to nil in Emacs 23.  It turns out that
>>> this screws up the use of all-completions in regexp-opt-group.
>> 
>>> Anyone have any idea what's going on here?
>> 
>> (string-as-unibyte "ä") => "\303\244"
>> (string-to-char "\303\244") => 195 (because ?\303 == 195)
>> (char-to-string 195) => "Ã" (because 195==0xC3 U+00C3=='Ã')
>> (string-match "Ã" "ä") => nil (obvious)
>> 
>> Any Lisp program that depends on the result of
>> string-as-unibyte (thus Emacs' internal character
>> representation) won't work in Emacs 23.

Notice that the problem is unrelated to string-as-unibyte:

   (string-match (char-to-string (string-to-char str)) str)

this should intuitively always return 0.  Of course, once you replace
`char-to-string' with just `string', you may be reminded that Emacs-23
introduced `unibyte-string', which leads you to the key: if `str' is
unibyte, you need to do

   (string-match (unibyte-string (string-to-char str)) str)

In Emacs-22, `string' used a heuristic to decide whether to build
a unibyte or multibyte string, and more importantly, the character
representing byte code 209 had code 209, whereas in Emacs-23, we have
the strange situation that byte 209 is character 4194257.

So an integer <256 needs to be accompanied by some contextual info
that says whether it represents a char or a byte, otherwise you get
ambiguity that leads to bugs.  And string-to-char returns either a byte
or a char depending on whether the string was unibyte or multibyte.
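
To make the byte/char distinction concrete, here is a minimal sketch,
assuming Emacs 23 or later.  The function name `byte-vs-char-demo' is
made up for illustration, and the unibyte string is built directly with
`unibyte-string' rather than with `string-as-unibyte' as in the original
report:

```elisp
;; Hypothetical helper, for illustration only (not part of Emacs).
(defun byte-vs-char-demo ()
  "Return a list showing how an integer <256 is read as a byte or a char."
  (let* ((str (unibyte-string #xC3 #xA4)) ; the two bytes "\303\244"
         (b (string-to-char str)))        ; 195, a byte since STR is unibyte
    (list
     ;; STR itself is unibyte.
     (multibyte-string-p str)
     ;; `char-to-string' reads 195 as a character and builds a
     ;; multibyte string.
     (multibyte-string-p (char-to-string b))
     ;; `unibyte-string' reads 195 as a byte and builds a unibyte string.
     (multibyte-string-p (unibyte-string b))
     ;; Matching the byte against the unibyte string succeeds at index 0.
     (string-match (unibyte-string b) str)
     ;; Byte 209, seen as a character, is the raw-byte char #x3FFFD1.
     (unibyte-char-to-multibyte 209))))

(byte-vs-char-demo)   ; => (nil t nil 0 4194257)
```

The same integer 195 thus gives a multibyte or a unibyte string depending
on which constructor is told how to interpret it.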

> I see.  However, maybe the following change to regexp-opt-group in
> regexp-opt.el would make things a little more predictable.  What do you
> think?

Yes, it looks like a good fix.  Maybe "-no-properties" would be even
better.


        Stefan





