Re: [bug-gnu-libiconv] Issue when using iconv 2.12 on RHEL 6.7
From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] Issue when using iconv 2.12 on RHEL 6.7
Date: Fri, 07 Apr 2017 19:21:17 +0200
User-agent: KMail/5.1.3 (Linux/4.4.0-71-generic; KDE/5.18.0; x86_64; ; )
Hi,
Lim, Yongkeong wrote:
> I have a data file which we managed to convert using macbook running on
> iconv (GNU libiconv 1.11), no characters got deleted after conversion.
> But when we upload the same file to the RHEL server running on iconv
> (GNU libiconv 2.12), some characters got deleted by the iconv function.
>
> Below is the command we used:
>
> iconv -c -f iso-8859-11 -t utf-8 <source file> > <output file>
The second machine is using iconv from GNU libc, not GNU libiconv
(2.12 is a GNU libc version number; GNU libiconv releases are numbered 1.x).
So, these are two different implementations of the iconv facility,
although both have very similar conversion tables.
For Thai, your file could be in encoding TIS-620, ISO-8859-11, or
Mac-Thai. [1] The conversion tables used by GNU libiconv and GNU libc
for ISO-8859-11 are identical [2], and likewise for TIS-620 [3].
I'd suggest that you
  1) Don't use the option "-c" of iconv - this option silently drops
     every character that cannot be converted, so its output is lossy
     by design.
  2) Instead, try harder to find the right encoding. That is, try
       iconv -f iso-8859-11 -t utf-8 source > output1
       iconv -f tis-620 -t utf-8 source > output2
       iconv -f macthai -t utf-8 source > output3
     and compare the resulting three output files.
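The three attempts can also be wrapped in a small loop. This is only a rough sketch: the function name try_encodings and the output file naming are made up for illustration and are not part of iconv. Because "-c" is left out, iconv exits with a non-zero status as soon as it meets a byte it cannot convert, so a wrong guess shows up as a failure instead of as silently missing characters. An implementation that does not know one of the encoding names (this may be the case for macthai with the iconv from GNU libc) simply reports a failure for that name.

```shell
#!/bin/sh
# Hypothetical helper: try each candidate Thai encoding on one source file.
# No "-c" is passed, so an unconvertible byte makes iconv fail loudly
# instead of being dropped from the output.
try_encodings() {
    src=$1
    for enc in iso-8859-11 tis-620 macthai; do
        out="$src.$enc.utf8"
        # Keep iconv's diagnostics next to the output for later inspection.
        if iconv -f "$enc" -t utf-8 "$src" > "$out" 2> "$out.err"; then
            echo "$enc: ok"
        else
            echo "$enc: failed (see $out.err)"
        fi
    done
}
```

After running "try_encodings source", the outputs that were reported as ok can be compared pairwise, for example with "cmp source.iso-8859-11.utf8 source.tis-620.utf8".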
Also, in general, ISO-8859-11 should not be used, since it is *not*
standardized - unlike TIS-620, which is a (national) standard. See [4],[5].
Bruno
[1] https://haible.de/bruno/charsets/conversion-tables/Thai.html
[2] https://haible.de/bruno/charsets/conversion-tables/ISO-8859-11.html
[3] https://haible.de/bruno/charsets/conversion-tables/TIS-620.html
[4] https://en.wikipedia.org/wiki/ISO/IEC_8859-11
[5] https://en.wikipedia.org/wiki/Thai_Industrial_Standard_620-2533