Re: [bug-gnu-libiconv] Updating iconv tables
From: Jim Breen
Subject: Re: [bug-gnu-libiconv] Updating iconv tables
Date: Fri, 13 Jun 2008 11:08:51 +1000
Hi Bruno,
Thanks for your considered reply. That makes things much clearer. I
wish I'd seen the page at
http://www.haible.de/bruno/charsets/conversion-tables/Japanese.html
before.
I can see the need for keeping EUC-JP and EUC-JISX0213 apart. While
the codepoints of the two can be mixed (they don't conflict), the
round trip from UTF-8 would be different for the kanji which were recoded
in JIS X 0213. I am afraid I was blindsided by the way Sun has put some
JIS X 0213 codepoints into their EUC-JP tables. Needless to say, they
don't support EUC-JISX0213 as such in their version of iconv.
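A minimal sketch of how the round-trip difference could be checked, using Python's built-in `euc_jp` and `euc_jisx0213` codecs rather than iconv itself (so the tables may not match GNU libiconv's or Sun's exactly). The helper names `encode_both` and `same_round_trip` are made up for illustration.

```python
# Sketch only: compares Python's own codec tables, which are assumed
# (not verified here) to be close to the GNU libiconv ones.

CODECS = ("euc_jp", "euc_jisx0213")


def encode_both(text):
    """Encode text under both codecs; None marks an unencodable string."""
    result = {}
    for codec in CODECS:
        try:
            result[codec] = text.encode(codec)
        except UnicodeEncodeError:
            result[codec] = None
    return result


def same_round_trip(text):
    """True if both codecs can encode text and produce identical bytes."""
    encoded = encode_both(text)
    return (
        encoded["euc_jp"] is not None
        and encoded["euc_jp"] == encoded["euc_jisx0213"]
    )


if __name__ == "__main__":
    # Characters from JIS X 0208 are expected to encode identically;
    # feed in JIS X 0212 kanji to see where the two encodings part ways.
    for sample in ("abc", "日本語"):
        print(sample, encode_both(sample), same_round_trip(sample))
```

Running a dictionary's kanji through `same_round_trip` would flag any character whose bytes depend on which of the two encodings is assumed.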
2008/6/13 Bruno Haible <address@hidden>:
> PS: I have no idea in which encoding your EDICT dictionary now actually is.
> If you started out writing it in EUC-JP and at some point switched to
> using EUC-JISX0213, you may have dozens of entries which are correct in
> EUC-JP but wrong in EUC-JISX0213, and dozens of entries for which it is
> the opposite.
Apart from two very recent additions using JIS X 0213 codepoints, it's all in
EUC-JP. It includes several hundred characters from JIS X 0212. I think I am
going to have to sideline the two new entries until such time as I can
migrate the whole system to Unicode. While the two new additions were
behaving OK on Solaris systems, they were breaking on Linux ones, which
alerted me that there was an inconsistency.
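One possible way to locate such entries is to try decoding each line strictly as EUC-JP and fall back to EUC-JISX0213 only when that fails. This is a rough sketch, assuming Python's `euc_jp` codec rejects byte sequences that are only assigned in JIS X 0213; `classify_line` and the file name `edict` are hypothetical.

```python
# Sketch only: relies on Python's strict euc_jp decoder, which is an
# assumption about codec behaviour, not a statement about iconv.

def classify_line(raw):
    """Return the first codec that decodes raw bytes, or 'neither'."""
    for codec in ("euc_jp", "euc_jisx0213"):
        try:
            raw.decode(codec)
            return codec
        except UnicodeDecodeError:
            continue
    return "neither"


def find_non_euc_jp(lines):
    """Return (line_number, classification) for lines that are not EUC-JP."""
    flagged = []
    for number, raw in enumerate(lines, start=1):
        kind = classify_line(raw.rstrip(b"\r\n"))
        if kind != "euc_jp":
            flagged.append((number, kind))
    return flagged


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python scan.py edict
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as handle:
            for number, kind in find_non_euc_jp(handle):
                print(number, kind)
```

Lines reported as `euc_jisx0213` would be the candidates to sideline; lines reported as `neither` would need a manual look.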
Best wishes
Jim
--
Jim Breen
Honorary Senior Research Fellow
Clayton School of Information Technology,
Monash University, VIC 3800, Australia
http://www.csse.monash.edu.au/~jwb/