From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] UTF-8 replacement character cannot be transliterated into ISO-8859-1
Date: Mon, 22 Oct 2007 01:19:35 +0200
User-agent: KMail/1.5.4
Hello Vincent,
Vincent Lefevre wrote:
> As shown by the attached script, the UTF-8 replacement character
> cannot be transliterated into ISO-8859-1 (tested under Mac OS X).
> This problem doesn't occur under Linux with iconv from glibc 2.6.1.
>
> Under Mac OS X (with libiconv built via MacPorts), the attached script
> gives:
>
> iconv (GNU libiconv 1.11)
> Copyright (C) 2000-2006 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions. There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> Written by Bruno Haible.
> éè
> EUR
> iconv: (stdin):3:0: cannot convert
>
> while under Linux, I get:
>
> iconv (GNU libc) 2.6.1
> Copyright (C) 2007 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions. There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> Written by Ulrich Drepper.
> éè
> EUR
> ?
> ...
>
> as expected.
This can be reproduced with libiconv on Linux as well, so it is libiconv
behavior rather than something specific to Mac OS X.
The amount of transliteration done is at the discretion of the implementation.
glibc has traditionally transliterated more characters; libiconv transliterates
only when the transliteration is well understood by everyone and culturally
neutral.
For U+FFFD there is no such neutral choice: European users prefer a question
mark '?', whereas CJK users prefer U+3013 (GETA MARK) for this purpose.
Bruno
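
Since the preferred stand-in for U+FFFD depends on the audience, one way for an
application to avoid relying on the converter's transliteration table is to
substitute U+FFFD itself before converting. The following is only a minimal
sketch of that idea in Python, not how libiconv or glibc work internally; the
names REPLACEMENTS and encode_with_fffd_policy and the "european"/"cjk" keys
are hypothetical and chosen for illustration.

```python
# Hypothetical per-audience substitutes for U+FFFD, following the
# preferences described in the message above.
REPLACEMENTS = {
    "european": "?",      # QUESTION MARK
    "cjk": "\u3013",      # GETA MARK
}


def encode_with_fffd_policy(text: str, encoding: str, audience: str) -> bytes:
    """Replace U+FFFD with the audience's substitute, then encode strictly.

    The strict encode raises UnicodeEncodeError if any character, including
    the chosen substitute, does not exist in the target encoding.
    """
    substitute = REPLACEMENTS[audience]
    return text.replace("\ufffd", substitute).encode(encoding)


if __name__ == "__main__":
    sample = "\u00e9\u00e8\ufffd"  # "éè" followed by U+FFFD
    print(encode_with_fffd_policy(sample, "iso-8859-1", "european"))
    # prints b'\xe9\xe8?'
```

Picking the "cjk" substitute with an ISO-8859-1 target raises an error, since
U+3013 has no ISO-8859-1 equivalent; that is comparable to the "cannot convert"
case in the report above.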