bug#12291: [rev 109796] wrong UTF-8 handling

bug-gnu-emacs

From:	Werner LEMBERG
Subject:	bug#12291: [rev 109796] wrong UTF-8 handling
Date:	Tue, 28 Aug 2012 21:22:26 +0200 (CEST)

> In both cases, user surely see them.

OK.  BTW, the real use case is a bug in Emacs 23.x which prevented
correct conversion from the emacs-mule encoding to UTF-8, creating such
oddly encoded UTF-8 files (I can't reproduce this problem with my
recently compiled Emacs, so it seems to have been fixed in the
meantime).

>> Instead, such characters must be converted to correct
>> UTF-8.
> 
> ??? I don't understand what you means by "correct UTF-8".

Sorry, I meant correct Unicode.  U+1351DE is larger than the largest
valid Unicode code point (U+10FFFF).  As my example demonstrates, the
Chinese character in the file is certainly *neither* a private
character nor a character from GB 18030, so it should be converted to
a regular Unicode value.
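
To make the range problem concrete, here is a minimal Python sketch (not
Emacs code).  The helper names `encode_4byte` and `is_valid_scalar` are
invented for illustration.  It packs a value into the 4-byte UTF-8 bit
pattern, which has room for 21 bits, and then checks the value against
the Unicode limit:

```python
MAX_UNICODE = 0x10FFFF  # largest valid Unicode code point


def encode_4byte(cp):
    """Pack cp into the 4-byte UTF-8 bit pattern (hypothetical helper).

    The pattern holds 21 payload bits, so it can represent values up to
    0x1FFFFF even though Unicode stops at 0x10FFFF.
    """
    if not 0x10000 <= cp <= 0x1FFFFF:
        raise ValueError("value does not fit the 4-byte pattern")
    return bytes([
        0xF0 | (cp >> 18),           # lead byte: top 3 bits
        0x80 | ((cp >> 12) & 0x3F),  # continuation: next 6 bits
        0x80 | ((cp >> 6) & 0x3F),   # continuation: next 6 bits
        0x80 | (cp & 0x3F),          # continuation: low 6 bits
    ])


def is_valid_scalar(cp):
    """True if cp is a Unicode scalar value (in range, not a surrogate)."""
    return 0 <= cp <= MAX_UNICODE and not 0xD800 <= cp <= 0xDFFF


raw = encode_4byte(0x1351DE)
print(raw.hex(" "))               # byte sequence for the out-of-range value
print(is_valid_scalar(0x1351DE))  # False

try:
    raw.decode("utf-8")
except UnicodeDecodeError as err:
    print("strict UTF-8 decoder rejects it:", err.reason)
```

The byte sequence is well-formed as a bit pattern, but a strict UTF-8
decoder refuses it because the decoded value lies beyond U+10FFFF.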

> I think the correct behaviour on reading such a file by utf-8 is to
> treat each byte as raw-byte.

Maybe.  I'm not sure how Emacs should behave when reading such files.
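
As an analogy only (this is Python's mechanism, not how Emacs
represents raw bytes internally), the "treat each byte as raw-byte"
idea can be sketched with the `surrogateescape` error handler, which
keeps undecodable bytes as placeholders and restores them on output.
The function names are invented for illustration:

```python
def read_lenient(data):
    """Decode UTF-8, keeping each undecodable byte as an escape character.

    With errors="surrogateescape", a byte 0xNN that cannot be decoded is
    mapped to the lone surrogate U+DCNN instead of raising an error.
    """
    return data.decode("utf-8", errors="surrogateescape")


def write_lenient(text):
    """Encode back to bytes, turning escape characters into raw bytes."""
    return text.encode("utf-8", errors="surrogateescape")


# Valid text surrounding the out-of-range sequence F4 B5 87 9E.
data = b"ab" + bytes([0xF4, 0xB5, 0x87, 0x9E]) + "\u4e2d".encode("utf-8")

text = read_lenient(data)
escaped = [c for c in text if 0xDC80 <= ord(c) <= 0xDCFF]

print(text.startswith("ab"), text.endswith("\u4e2d"))  # valid parts decode normally
print([hex(ord(c)) for c in escaped])                   # the preserved raw bytes
print(write_lenient(text) == data)                      # file round-trips unchanged
```

The point of such a scheme is that the file is never silently altered:
the invalid bytes stay visible to the user and are written back as they
were read.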


    Werner

Current Thread

bug#12291: [rev 109796] wrong UTF-8 handling, Werner LEMBERG, 2012/08/28
- bug#12291: [rev 109796] wrong UTF-8 handling, Andreas Schwab, 2012/08/28
- bug#12291: [rev 109796] wrong UTF-8 handling, Kenichi Handa, 2012/08/28
  - bug#12291: [rev 109796] wrong UTF-8 handling, Werner LEMBERG <=
    - bug#12291: [rev 109796] wrong UTF-8 handling, Eli Zaretskii, 2012/08/31
