bug-gnu-libiconv

Re: [bug-gnu-libiconv] iconv fails to convert utf8 with bom to cp1251


From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] iconv fails to convert utf8 with bom to cp1251
Date: Fri, 08 Dec 2017 00:24:48 +0100
User-agent: KMail/5.1.3 (Linux/4.4.0-101-generic; KDE/5.18.0; x86_64; ; )

Hi,

> > iconv SHOULD not allow a BOM in this conversion
> 
> Should doesn't mean must. Anyway I didn't provide any input encoding,
> only output.

iconv always takes an input encoding. If you don't specify one
explicitly with -f, you implicitly specify the locale's encoding, which
on Linux nowadays is most likely UTF-8.
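
To see which input encoding is being assumed in that case, one can ask
the locale directly. A minimal sketch; the function name locale_encoding
is only an illustration and not part of iconv:

```shell
# Print the character encoding of the current locale. This is the
# encoding iconv falls back to for its input when -f is omitted.
# (locale_encoding is a hypothetical helper name.)
locale_encoding() {
  locale charmap
}
```

In a UTF-8 locale this typically prints UTF-8, in which case the input
is decoded as UTF-8 even though no -f option was given.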

> So how to escape this problem? I see two options: add another encoding
> called utf8-bom or ignore bom character.

Once you know that the file is in UTF-8+BOM encoding, you need to
strip off the BOM (the first three bytes, EF BB BF) before converting:
  $ tail --bytes=+4 < FILE | iconv -f UTF-8 -t ...
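
The command above drops the first three bytes unconditionally, so it
would damage a file that has no BOM. If you are not sure whether the BOM
is present, a check along the following lines may help. This is only a
sketch: strip_bom is a hypothetical helper name, and it assumes a head
that supports -c, as GNU and BSD head do.

```shell
# Write the contents of file "$1" to stdout, without a leading UTF-8 BOM.
# (strip_bom is a hypothetical helper, not part of libiconv.)
strip_bom() {
  # Dump the first three bytes as hex and remove od's spacing.
  first3=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
  if [ "$first3" = "efbbbf" ]; then
    # BOM present: start output at byte 4.
    tail -c +4 "$1"
  else
    # No BOM: pass the file through unchanged.
    cat "$1"
  fi
}

# Possible usage, with CP1251 as the target encoding from this thread:
#   strip_bom FILE | iconv -f UTF-8 -t CP1251
```

Files without a BOM then pass through untouched, so the same pipeline
can be used for both kinds of input.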

Bruno



