[bug-gnu-libiconv] Support question: libiconv on system with glibc?
From: Russell McOrmond
Subject: [bug-gnu-libiconv] Support question: libiconv on system with glibc?
Date: Wed, 4 Feb 2009 12:55:21 -0500 (EST)
User-agent: Alpine 2.00 (LFD 1167 2008-08-23)
I have an environment where I would like to separate off as much of our
application into a chroot() environment as possible. We figured that
using the separate libiconv would help, so that we didn't need to bring
all of glibc (i.e. /usr/lib/gconv, etc.) into the chroot() environment.
I have been having a problem getting libiconv to work in this
environment. This is a RedHat Enterprise 4 machine (glibc 2.3.4), trying
to compile libiconv 1.12.
I've tried linking the application ( http://mapserver.org/ ) against
libiconv and I get different characters from the ones I expect. To isolate
the issue I've compiled the application against the glibc iconv, and tried
using the preloadable_libiconv.so (built using a simple `./configure ;
make ; make check`, where the check indicates all is fine).
checking build system type... i686-pc-linux-gnu
...
checking byte ordering... little endian
We have data that is encoded in UTF-16 which we are outputting in UTF-8
(a very simple transcode), inserted into an HTML template.
The relevant part should be output in UTF-8 as:
<td>Chernozémique</td>
(Note the accented e)
Here is the test using 'od' to show the UTF-8 encoding when using the
glibc version of the iconv functions.
-bash-3.00$ sh ~/test-mapserv.sh | od -c -j247 -N23
0000367 < t d > C h e r n o z 303 251 m i q
0000407 u e < / t d >
0000416
And here is what happens when I use the libiconv version.
-bash-3.00$ export LD_PRELOAD=/server/downloads/src/libiconv-1.12/lib/preloadable_libiconv.so
-bash-3.00$ sh ~/test-mapserv.sh | od -c -j247 -N48
0000367 < t d > 344 214 200 346 240 200 346 224 200 347 210 200
0000407 346 270 200 346 274 200 347 250 200 356 244 200 346 264 200 346
0000427 244 200 347 204 200 347 224 200 346 224 200 < / t d >
0000447
-bash-3.00$
Does this type of problem seem familiar? Does the 3-byte octal sequence
346 224 200 representing an 'e' look familiar? (If I group the bytes into
threes I see the two e's in the third and last groups.) Does an encoding
using 3 bytes, always ending in octal 200 (decimal 128), seem familiar? Is
something byte-swapped the wrong way?
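One way to test the byte-swap suspicion is to decode the bytes from the
dump above and look at the resulting code points. The sketch below is a
quick check in Python, not part of mapserver or libiconv; the helper names
decode_dump and swap_code_units are made up for illustration, and the octal
values are copied from the od output between <td> and </td>.

```python
# Octal bytes copied from the od dump of the libiconv output.
octal_dump = """
344 214 200 346 240 200 346 224 200 347 210 200
346 270 200 346 274 200 347 250 200 356 244 200 346 264 200 346
244 200 347 204 200 347 224 200 346 224 200
"""


def decode_dump(dump: str) -> str:
    """Turn whitespace-separated octal byte values into text via UTF-8."""
    raw = bytes(int(token, 8) for token in dump.split())
    return raw.decode("utf-8")


def swap_code_units(text: str) -> str:
    """Exchange the high and low byte of every 16-bit code point."""
    return "".join(
        chr(((ord(ch) & 0xFF) << 8) | (ord(ch) >> 8)) for ch in text
    )


garbled = decode_dump(octal_dump)
restored = swap_code_units(garbled)

# Code points actually present in the garbled output.
print([f"U+{ord(ch):04X}" for ch in garbled])
# UTF-8 bytes of the text after undoing the suspected byte swap.
print(restored.encode("utf-8"))
```

The code points come out as U+4300, U+6800, U+6500 and so on, that is, each
expected character with its two bytes exchanged ('C' is U+0043, 'e' is
U+0065). Swapping them back yields the expected word, whose UTF-8 form
contains the \xc3\xa9 (octal 303 251) seen in the glibc output. That would
be consistent with the UTF-16 input being read with the opposite byte order.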
Is there something special I need to do when building libiconv to ensure
various character encodings are enabled? Is there a directory equivalent
to gconv that I need to be installing and pointing to with some
configuration variable/file?
Is there something different in the glibc vs libiconv functions where
there may be a bug in the application (mapserver) that is visible with one
library, but not the other?
In case anyone is curious how iconv is being called, the relevant code
is here:
http://trac.osgeo.org/mapserver/browser/trunk/mapserver/mapstring.c#L1504
The variable 'encoding' on input is set to "UTF-16" , so this is a
simple conversion from UTF-16 to UTF-8.
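Since the input is labelled only "UTF-16", with no explicit byte order, the
two libraries may be resolving that ambiguity differently when the data
carries no byte order mark. The following Python sketch reproduces both
outputs under that assumption. It encodes the expected word as little-endian
UTF-16 without a BOM (an assumption about our data, not something verified
against the actual file), then transcodes it to UTF-8 once reading it as
little-endian and once as big-endian. The helper octal is hypothetical and
only formats bytes the way od prints them.

```python
# The word we expect to see in the HTML output.
text = "Chernoz\u00e9mique"

# ASSUMPTION: the source data is little-endian UTF-16 with no BOM.
le_bytes = text.encode("utf-16-le")

# Transcode to UTF-8, interpreting the same input bytes both ways.
read_as_le = le_bytes.decode("utf-16-le").encode("utf-8")
read_as_be = le_bytes.decode("utf-16-be").encode("utf-8")


def octal(data: bytes) -> str:
    """Format bytes as three-digit octal values, similar to od output."""
    return " ".join(f"{b:03o}" for b in data)


print("little-endian:", octal(read_as_le))
print("big-endian:   ", octal(read_as_be))
```

If the big-endian line matches the libiconv dump, it may be worth trying an
explicit "UTF-16LE" as the source encoding instead of plain "UTF-16", so
that neither library has to guess the byte order.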
--
Russell McOrmond, Internet Consultant: <http://www.flora.ca/>
Please help us tell the Canadian Parliament to protect our property
rights as owners of Information Technology. Sign the petition!
http://digital-copyright.ca/petition/ict/ http://KillBillC61.ca
"The government, lobbied by legacy copyright holders and hardware
manufacturers, can pry control over my camcorder, computer,
home theatre, or portable media player from my cold dead hands!"