From: Eli Zaretskii
Subject: bug#16448: 24.3; Messages from (error "...") with UTF-8 chars are printed wrongly in Emacs Lisp scripts
Date: Wed, 15 Jan 2014 17:35:43 +0200

> Date: Wed, 15 Jan 2014 08:02:49 +0400
> From: Dmitry Antipov <dmantipov@yandex.ru>
> Cc: 16448@debbugs.gnu.org
>
> On 01/15/2014 04:10 AM, Sergey Tselikh wrote:
>
> > In a script, when (error "...") instruction is executed with some UTF-8
> > characters in its text, the message is not printed correctly.
>
> In batch mode, (error ...) is handled by external-debugging-output, and the
> latter just does:
>
> putc (XINT (character) & 0xFF, stderr);
>                        ^^^^^^
> To allow multibyte sequences here, we should use something like:
>
> === modified file 'src/print.c'
> --- src/print.c 2014-01-01 07:43:34 +0000
> +++ src/print.c 2014-01-15 03:55:39 +0000
> @@ -709,8 +709,14 @@
> to make it write to the debugging output. */)
> (Lisp_Object character)
> {
> + unsigned char str[MAX_MULTIBYTE_LENGTH];
> + unsigned int ch;
> + ptrdiff_t len;
> +
> CHECK_NUMBER (character);
> - putc (XINT (character) & 0xFF, stderr);
> + ch = XINT (character);
> + len = CHAR_STRING (ch, str);
> + fwrite (str, len, 1, stderr);

This will only work correctly in a UTF-8 locale. In the general case,
we need to run the resulting multibyte sequence through ENCODE_SYSTEM,
before writing it to stderr.
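
To illustrate the locale dependence, here is a minimal standalone
sketch, not Emacs code. utf8_from_char and latin1_from_char are
hypothetical helpers: the first stands in for CHAR_STRING (whose
output coincides with UTF-8 for Unicode characters), the second for
encoding to a Latin-1 locale. The real fix would go through
ENCODE_SYSTEM rather than a hand-written converter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CHAR_STRING: store code point CH as UTF-8
   in BUF (at least 4 bytes) and return the number of bytes written.  */
size_t
utf8_from_char (unsigned int ch, unsigned char *buf)
{
  assert (ch <= 0x10FFFF);
  if (ch < 0x80)
    {
      buf[0] = ch;
      return 1;
    }
  if (ch < 0x800)
    {
      buf[0] = 0xC0 | (ch >> 6);
      buf[1] = 0x80 | (ch & 0x3F);
      return 2;
    }
  if (ch < 0x10000)
    {
      buf[0] = 0xE0 | (ch >> 12);
      buf[1] = 0x80 | ((ch >> 6) & 0x3F);
      buf[2] = 0x80 | (ch & 0x3F);
      return 3;
    }
  buf[0] = 0xF0 | (ch >> 18);
  buf[1] = 0x80 | ((ch >> 12) & 0x3F);
  buf[2] = 0x80 | ((ch >> 6) & 0x3F);
  buf[3] = 0x80 | (ch & 0x3F);
  return 4;
}

/* Hypothetical stand-in for encoding to a non-UTF-8 locale: store CH
   as Latin-1 in BUF, substituting '?' for unrepresentable characters.
   Always writes exactly one byte.  */
size_t
latin1_from_char (unsigned int ch, unsigned char *buf)
{
  buf[0] = ch < 0x100 ? ch : '?';
  return 1;
}
```

For U+00E9, utf8_from_char yields the two bytes C3 A9, while
latin1_from_char yields the single byte E9. Writing the first form to
a Latin-1 terminal shows two wrong characters instead of one right
one, which is why the bytes need encoding for the locale before they
reach stderr.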

Btw, the way we output text in this case cries out for refactoring: we
first assemble individual characters from their multibyte sequences,
then pass those characters one by one to external-debugging-output,
which will now have to unroll each character back into its multibyte
sequence and encode each character individually. Something for after
the branch, I guess.
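
The redundancy can be seen in a small standalone sketch, again not
Emacs code. All names here (decode_one, encode_one, emit_per_char,
emit_whole) are hypothetical, and the input is assumed to be
well-formed UTF-8. emit_per_char mimics the character-by-character
round trip described above; emit_whole hands over the byte sequence
in one step, which is where a single encoding pass could be applied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: decode one UTF-8 sequence at P into *CH and
   return its length in bytes.  Assumes well-formed input.  */
size_t
decode_one (const unsigned char *p, unsigned int *ch)
{
  if (p[0] < 0x80)
    { *ch = p[0]; return 1; }
  if (p[0] < 0xE0)
    { *ch = (p[0] & 0x1F) << 6 | (p[1] & 0x3F); return 2; }
  if (p[0] < 0xF0)
    {
      *ch = (p[0] & 0x0F) << 12 | (p[1] & 0x3F) << 6 | (p[2] & 0x3F);
      return 3;
    }
  *ch = ((p[0] & 0x07) << 18 | (p[1] & 0x3F) << 12
         | (p[2] & 0x3F) << 6 | (p[3] & 0x3F));
  return 4;
}

/* Hypothetical helper: store CH as UTF-8 in BUF, return its length.  */
size_t
encode_one (unsigned int ch, unsigned char *buf)
{
  assert (ch <= 0x10FFFF);
  if (ch < 0x80)
    { buf[0] = ch; return 1; }
  if (ch < 0x800)
    { buf[0] = 0xC0 | (ch >> 6); buf[1] = 0x80 | (ch & 0x3F); return 2; }
  if (ch < 0x10000)
    {
      buf[0] = 0xE0 | (ch >> 12);
      buf[1] = 0x80 | ((ch >> 6) & 0x3F);
      buf[2] = 0x80 | (ch & 0x3F);
      return 3;
    }
  buf[0] = 0xF0 | (ch >> 18);
  buf[1] = 0x80 | ((ch >> 12) & 0x3F);
  buf[2] = 0x80 | ((ch >> 6) & 0x3F);
  buf[3] = 0x80 | (ch & 0x3F);
  return 4;
}

/* Per-character flow: split MSG into characters, then turn each
   character back into bytes.  Returns the number of bytes in OUT.  */
size_t
emit_per_char (const unsigned char *msg, size_t len, unsigned char *out)
{
  size_t i = 0, n = 0;
  while (i < len)
    {
      unsigned int ch;
      i += decode_one (msg + i, &ch);
      n += encode_one (ch, out + n);
    }
  return n;
}

/* Whole-string flow: pass the byte sequence along in one step.  */
size_t
emit_whole (const unsigned char *msg, size_t len, unsigned char *out)
{
  memcpy (out, msg, len);
  return len;
}
```

Both functions produce identical bytes for well-formed input, so the
decode and re-encode in the per-character path is pure overhead.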