bug-gnu-libiconv

Re: [bug-gnu-libiconv] gb18030 for 0x215d7


From: Link
Subject: Re: [bug-gnu-libiconv] gb18030 for 0x215d7
Date: Sun, 28 May 2023 04:51:24 +0800


Bruno wrote:
 
> Some characters got mapped to the Unicode PUA, because they were not
> in Unicode at that time. Then they got added to Unicode.

> A bijective 1-1 conversion table does not provide the best user experience
> in this situation.

I worked out a little of the history:
GB18030-2000 (covers up to Ext-A): 0xFE6C -> U+E831 (PUA)
Unicode later adopted the character as U+215D7 (Ext-B)
GB18030-2005 adopted Ext-B: 0x9536B937 -> U+215D7

The real question here:
U+215D7 -> GB18030: 0xFE6C or 0x9536B937?
I think 0x9536B937 is the better choice, because all Ext-B characters in GB18030 are encoded in 4 bytes.
I no longer insist on a 1-1 conversion, since the PUA mapping (U+E831) should be retired some day.
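
As a rough cross-check of the 4-byte form, here is a minimal Python sketch. It assumes the linear four-byte layout that GB18030 uses for the supplementary planes, with U+10000 at 0x90308130. The function name gb18030_four_byte is only illustrative and is not part of libiconv.

```python
def gb18030_four_byte(code_point: int) -> bytes:
    """Return the four-byte GB18030 form of a supplementary-plane code point.

    Assumes the linear layout in which U+10000 maps to 0x90308130.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("expected a code point in U+10000..U+10FFFF")
    offset = code_point - 0x10000
    # Byte ranges: b1 starts at 0x90, b2 is 0x30..0x39,
    # b3 is 0x81..0xFE (126 values), b4 is 0x30..0x39.
    offset, b4 = divmod(offset, 10)
    offset, b3 = divmod(offset, 126)
    b1, b2 = divmod(offset, 10)
    return bytes([0x90 + b1, 0x30 + b2, 0x81 + b3, 0x30 + b4])


# U+215D7 should come out as the 4-byte code quoted above.
print(gb18030_four_byte(0x215D7).hex())  # 9536b937
```

This only does the arithmetic for the 4-byte code. It says nothing about which of the two codes a converter should prefer when encoding U+215D7.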
