Re: Possible UTF-8 CJK Regressions in Terminal Emulators
From: Stefan Monnier
Subject: Re: Possible UTF-8 CJK Regressions in Terminal Emulators
Date: 09 Jun 2004 05:38:30 -0400
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50

> As surrogate pairs were not handled well by the UTF-16 converter,
> I've just fixed that too (not yet installed; I'm now adding
> comments to the code). Untranslatable characters are decoded
> into UTF-8 form represented by the sequence of
> eight-bit-graphic/control characters (the same way as UTF-8
> decoding, thus we can use utf-8-post-read-conversion). The
> UTF-16 encoder encodes such a sequence back to the original
> UTF-16 form. So, now the UTF-16 support is at the same
> level as UTF-8.
Does that mean that some sequences of eight-bit-graphic/control characters
are not encoded into the corresponding raw bytes?
If so, that makes me a bit uneasy, since those special chars were
introduced specifically to handle things like binary input or
bad-byte-sequences and make sure that we at least preserve the raw bytes in
those cases.
Stefan
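
The property at stake in this message, that bytes which cannot be decoded should still survive a decode/encode round trip unchanged, can be illustrated with a small sketch. This is not Emacs code and not the converter discussed above; it uses Python's standard "surrogateescape" error handler as an analogous mechanism, in which each undecodable byte is mapped to a placeholder character and mapped back to the same raw byte on encoding. The function names are hypothetical.

```python
# Illustration only: Python's "surrogateescape" error handler, used here as an
# analogy for placeholder characters that stand in for raw bytes. Not Emacs code.

def decode_preserving(raw: bytes) -> str:
    """Decode UTF-8; each undecodable byte becomes a lone surrogate U+DC80..U+DCFF."""
    return raw.decode("utf-8", errors="surrogateescape")

def encode_preserving(text: str) -> bytes:
    """Encode UTF-8; each placeholder surrogate is turned back into its raw byte."""
    return text.encode("utf-8", errors="surrogateescape")

# Valid ASCII, two bytes that are invalid in UTF-8, then one valid CJK character.
raw = b"abc\xff\xfe\xe6\x97\xa5"

text = decode_preserving(raw)

# The round trip must reproduce the input byte for byte.
assert encode_preserving(text) == raw
```

The concern raised above corresponds to asking whether any sequence of placeholder characters could be encoded as something other than the raw bytes it stands for; in this sketch the final assertion is what would fail in that case.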