Re: [bug-gnu-libiconv] Updating iconv tables
From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] Updating iconv tables
Date: Fri, 13 Jun 2008 13:34:30 +0200
User-agent: KMail/1.5.4
> And indeed the character that you meant to show me (bytes 0xAD 0xEA)
> in EUC-JISX0213 is U+3231. In EUC-JISX0213, but not in EUC-JP.
Also, when you look at all tables in
http://www.haible.de/bruno/charsets/conversion-tables/EUC-JP.html
you see that a mapping 0xAD 0xEA -> U+3231 is also present in
jdk-1.4.2/EUC-JP-SOLARIS.INVERSE.TXT
jdk-1.5.0/EUC-JP-SOLARIS.INVERSE.TXT
glibc-2.3.6-iconv/EUC-JP-MS.TXT
Since you say that you are using Solaris, you probably fell into a
portability pitfall of EUC-JP-SOLARIS. It has nothing to do with JIS X 0213.
It is simply yet another vendor extension.
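To see for yourself which decoder accepts the byte pair, you can try it
against several codecs. The following is only a sketch: it uses Python's
built-in codecs rather than GNU iconv, the helper name try_decode is made up
for this illustration, and what gets printed depends on the tables shipped
with your Python version.

```python
def try_decode(data, codec):
    """Return the decoded text, or None if the codec rejects the bytes."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError:
        return None


# The byte pair discussed in this thread.
pair = b"\xad\xea"

for codec in ("euc_jp", "euc_jisx0213"):
    text = try_decode(pair, codec)
    if text is None:
        print(f"{codec}: not mapped")
    else:
        # Show the code points rather than the glyphs.
        print(codec + ": " + " ".join(f"U+{ord(ch):04X}" for ch in text))
```

A codec that reports "not mapped" here follows a table without this
extension; one that reports U+3231 includes it.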
As you can imagine, we GNU iconv developers won't replicate every
possible vendor extension in GNU iconv. So Linux users have no way
of performing the same conversion step from EUC-JP-SOLARIS to UTF-8 on
their machines as you do on yours.
Therefore if you distribute EDICT in EUC-JP-SOLARIS encoding, it is only
useful to Solaris users. If you distribute it in UTF-8 encoding, it is
useful to everyone. That's what you get as a result of using a system
whose vendor has "extended"/"enhanced" the EUC-JP encoding on its system!
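
Before distributing a file, you could check whether it contains bytes that a
given decoder rejects. This is a rough sketch under assumptions:
undecodable_spans is a hypothetical helper, Python's "euc_jp" codec stands in
for a decoder without vendor extensions, and after a failure the scan simply
resumes at the next byte, so offsets following the first hit may be
misaligned in a multibyte encoding.

```python
def undecodable_spans(data, codec="euc_jp"):
    """Return a list of (offset, bytes) pairs that the codec rejects."""
    spans = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode(codec)
            break  # the rest of the data decodes cleanly
        except UnicodeDecodeError as err:
            start = pos + err.start
            # Always advance by at least one byte so the loop terminates.
            end = pos + max(err.end, err.start + 1)
            spans.append((start, data[start:end]))
            pos = end
    return spans


if __name__ == "__main__":
    sample = b"EDICT \xad\xea entry"
    for offset, raw in undecodable_spans(sample):
        print(f"offset {offset}: {raw.hex()}")
```

An empty result only means that this particular decoder accepted every
byte; it does not prove that other implementations will produce the same
characters.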
Bruno