bug-gnu-libiconv

Re: [bug-gnu-libiconv] Updating iconv tables


From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] Updating iconv tables
Date: Fri, 13 Jun 2008 13:34:30 +0200
User-agent: KMail/1.5.4

> And indeed the character that you meant to show me (bytes 0xAD 0xEA)
> in EUC-JISX0213 is U+3231. In EUC-JISX0213, but not in EUC-JP.

Also, when you look at all tables in
  http://www.haible.de/bruno/charsets/conversion-tables/EUC-JP.html
you see that a mapping 0xAD 0xEA -> U+3231 is also present in
  jdk-1.4.2/EUC-JP-SOLARIS.INVERSE.TXT
  jdk-1.5.0/EUC-JP-SOLARIS.INVERSE.TXT
  glibc-2.3.6-iconv/EUC-JP-MS.TXT

Since you say that you are using Solaris, you probably fell into a
portability pitfall of EUC-JP-SOLARIS. It has nothing to do with JIS X 0213.
It is simply yet another vendor extension.
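
To see how a given decoder treats the byte pair 0xAD 0xEA, one can probe it directly. The following is a minimal sketch using Python's built-in codecs, which are an independent implementation and not GNU libiconv; the codec names `euc_jp` and `euc_jisx0213` are Python's, and the helper `try_decode` is hypothetical. Python ships no EUC-JP-SOLARIS codec, so the vendor mapping itself cannot be reproduced this way.

```python
def try_decode(raw: bytes, encoding: str):
    """Return the decoded text, or None if the codec rejects the bytes."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# The byte pair discussed in this thread.
SEQUENCE = bytes([0xAD, 0xEA])

# Probe each codec and remember what it produced (None means "no mapping").
results = {name: try_decode(SEQUENCE, name) for name in ("euc_jp", "euc_jisx0213")}

for name, text in results.items():
    if text is None:
        print(f"{name}: no mapping for 0xAD 0xEA")
    else:
        code_points = " ".join(f"U+{ord(ch):04X}" for ch in text)
        print(f"{name}: {code_points}")
```

A decoder that reports "no mapping" here is following the standard table; one that returns a character is applying JIS X 0213 or a vendor extension.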

As you can imagine, we GNU iconv developers won't replicate every
possible vendor extension in GNU iconv, so Linux users have no way
of performing the same conversion from EUC-JP-SOLARIS to UTF-8 on
their machines as you do on yours.

Therefore, if you distribute EDICT in EUC-JP-SOLARIS encoding, it is only
useful to Solaris users. If you distribute it in UTF-8 encoding, it is
useful to everyone. That's what you get from using a system whose vendor
has "extended"/"enhanced" the EUC-JP encoding!
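
Before distributing a file, it can help to check whether it contains byte sequences that a portable decoder rejects. The following is a minimal sketch under stated assumptions: the helper name `first_undecodable` and the file name in the usage note are hypothetical, and Python's `euc_jp` codec stands in for a standard EUC-JP decoder; it is not GNU iconv.

```python
def first_undecodable(data: bytes, encoding: str):
    """Return the byte offset of the first sequence the codec rejects,
    or None if the whole input decodes cleanly."""
    try:
        data.decode(encoding)  # strict error handling is the default
    except UnicodeDecodeError as err:
        return err.start
    return None


def recode_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode strictly from source_encoding and re-encode as UTF-8.
    Raises UnicodeDecodeError if the input is not valid in that encoding."""
    return data.decode(source_encoding).encode("utf-8")
```

For example, `first_undecodable(open("edict", "rb").read(), "euc_jp")` would return the offset of the first non-portable sequence, if any. The conversion to UTF-8 itself still has to run on a system whose decoder knows the vendor mappings.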

Bruno




