From: Paul Eggert
Subject: bug#24206: 25.1; Curly quotes generate invalid strings, leading to a segfault
Date: Sun, 14 Aug 2016 19:04:42 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0

Eli Zaretskii wrote:
> Its multibyteness is entirely in Emacs's imagination.

Sure, but Emacs should not substitute "\342\200\230" for "`". The point of text-quoting-style is to substitute quotes, not byte string encodings of quotes.
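
To make the distinction concrete: the octal escapes "\342\200\230" are the three UTF-8 bytes of the curved quote ‘ (U+2018), not the quote character itself. A minimal sketch in Python, used here only for illustration (it says nothing about how Emacs stores strings internally):

```python
# The left single quotation mark, U+2018, as a single character.
curly_quote = "\u2018"

# Its UTF-8 encoding is a sequence of three bytes,
# written below with the same octal escapes as in the message.
encoded = curly_quote.encode("utf-8")

print(len(curly_quote))            # number of characters
print(len(encoded))                # number of bytes
print(encoded == b"\342\200\230")  # same bytes as the octal form
```

The character and its byte encoding are different objects with different lengths, which is why substituting one for the other matters.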
>> More generally, Fsubstitute_command_keys is quite confused about unibyte
>> versus multibyte issues. It merges together a number of strings, and
>> assumes that they are all multibyte iff the original string is
>> multibyte, which is obviously not true in general.
>
> Could you please point out the specific places where this is done?

OK, here's a contrived example. Run this code in emacs-25:

    (progn
      (setq km (make-keymap))
      (define-key km "≠" 'global-set-key)
      (substitute-command-keys "\200\\<km>\\[global-set-key]"))

This should return a 2-character string equal to "\200≠". But in Emacs 25 it dumps core, at least on my platform (Fedora 23 x86-64). And in Emacs 24 on my platform it returns a malformed string that prints as "\242\1340" but has length 2. I suppose we could make Emacs 24 dump core too, though I haven't tried hard to do that.

The problem is that the older Emacs code incorrectly assumes that the output of substitution must be properly-encoded if the substitution changes something. This assumption can fail if the input is unibyte and contains bytes that are not properly-encoded for UTF-8. (There are other ways the assumption can fail.)
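
As a rough illustration outside Emacs, the following Python sketch shows why merging raw bytes with properly encoded text need not give properly encoded output. It is not the Emacs code; the helper name `is_valid_utf8` and the variable names are made up for this example, and plain byte concatenation is only an assumed stand-in for the merging step.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if DATA decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

# The raw byte \200 (0x80) from the unibyte input. On its own it is a
# UTF-8 continuation byte with no lead byte, so it is not valid UTF-8.
raw_input_bytes = b"\200"

# The substituted key "≠", properly encoded as UTF-8.
substituted = "≠".encode("utf-8")

# Assumed stand-in for the merge: plain byte concatenation.
merged = raw_input_bytes + substituted

print(is_valid_utf8(substituted))  # the substituted part alone
print(is_valid_utf8(merged))       # the merged result
```

The substituted part is well-formed by itself, but the merged bytes are not, so "something was substituted" does not imply "the result is properly encoded".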