[bug-gnu-libiconv] Re: 3 char from UTF-8 to MacRoman iconv
From: Bruno Haible
Subject: [bug-gnu-libiconv] Re: 3 char from UTF-8 to MacRoman iconv
Date: Wed, 2 Jul 2008 13:00:31 +0200
User-agent: KMail/1.5.4

Hi,
address@hidden wrote:
> The multiple Unicode code points for 0xBD and 0xDB will result in two
> different Unicode strings being translated into the same MACROMAN
> string, making the "return trip" ambiguous. I am curious, though:
> since libiconv already makes a decisive choice when going from
> MACROMAN to UTF-8 (instead of rejecting those characters),
> wouldn't it make sense for it to apply the same consistent
> behavior from UTF-8 to MACROMAN?
libiconv currently implements only a 1:1 conversion, exactly as listed in the
file libiconv/tests/MacRoman.TXT. I'm also not much of a fan of mapping two
different Unicode code points to the same byte value, because of the round-trip
problem you describe. It's safer to tell the user clearly that a certain Unicode
code point (such as U+20AC) is not supported in that particular character set.
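
To make the round-trip problem concrete, here is a minimal Python sketch. It
does not call libiconv, and the two-entry table is an assumption made for
illustration only (it is not copied from libiconv/tests/MacRoman.TXT). The
names DECODE, ENCODE_STRICT, ENCODE_LENIENT, encode and decode are all
hypothetical.

```python
# Illustrative two-entry excerpt of a byte -> Unicode table.
# ASSUMPTION: these pairings are chosen for the sketch, not taken from
# libiconv/tests/MacRoman.TXT.
DECODE = {0xBD: 0x2126, 0xDB: 0x00A4}

# Strict 1:1 encoder: the exact inverse of DECODE, nothing more.
ENCODE_STRICT = {cp: byte for byte, cp in DECODE.items()}

# Lenient encoder: additionally folds a second code point onto each byte.
ENCODE_LENIENT = dict(ENCODE_STRICT)
ENCODE_LENIENT.update({0x03A9: 0xBD, 0x20AC: 0xDB})


def encode(text, table):
    """Encode text with the given table, rejecting unmapped code points."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp not in table:
            raise ValueError(f"U+{cp:04X} is not supported")
        out.append(table[cp])
    return bytes(out)


def decode(data):
    """Decode bytes with the single, fixed DECODE table."""
    return "".join(chr(DECODE[b]) for b in data)


if __name__ == "__main__":
    euro = "\u20ac"

    # Lenient: the conversion succeeds, but the return trip yields a
    # different code point than the one we started with.
    back = decode(encode(euro, ENCODE_LENIENT))
    print(f"lenient round trip: U+{ord(euro):04X} -> U+{ord(back):04X}")

    # Strict: the unsupported code point is reported instead.
    try:
        encode(euro, ENCODE_STRICT)
    except ValueError as err:
        print("strict:", err)
```

With the lenient table the euro sign silently comes back as a different
character; with the strict table the caller is told up front that the
code point cannot be represented.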
> I am still unclear about the motivation behind the Apple logo,
> because even when I am on a Linux system (which I am),
> its private-use U+F8FF should still get translated into
> byte 240 (0xF0), shouldn't it?
It was not done this way in the MAC-ROMAN mapping table published on
ftp.unicode.org around 2000.
Bruno