[bug #51330] preconv fails to detect utf-8 without BOM
From: Bertrand Garrigues
Subject: [bug #51330] preconv fails to detect utf-8 without BOM
Date: Tue, 27 Jun 2017 18:31:11 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 6.1; rv:54.0) Gecko/20100101 Firefox/54.0
URL:
<http://savannah.gnu.org/bugs/?51330>
Summary: preconv fails to detect utf-8 without BOM
Project: GNU troff
Submitted by: bgarrigues
Submitted on: Tue 27 Jun 2017 10:31:10 PM UTC
Severity: 3 - Normal
Item Group: None
Status: Confirmed
Privacy: Public
Assigned to: bgarrigues
Open/Closed: Open
Discussion Lock: Any
Planned Release: None
_______________________________________________________
Details:
(See also comment #1 from bug #50989)
typesetting.pdf is not generated correctly in the build tree. Its source is a
utf-8 file without BOM that contains some characters with French accents, and
the build passes LC_ALL=C. This causes `preconv' to use "latin1" as the
default encoding, which is the expected behaviour according to the man page
of `preconv', so the characters with accents are not handled properly.
There are several quick fixes for the generation of the mom examples:
- Add a BOM to the .mom files.
- Use '-K utf8' instead of just '-k'.
- Add a coding tag (as described in the man page of `preconv') to the .mom
  files.
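
As a rough illustration of the first quick fix (adding a BOM), here is a
minimal Python sketch. The helper name `with_utf8_bom' is hypothetical and is
not part of groff or its build system; it only shows what "adding a BOM" means
at the byte level.

```python
import codecs


def with_utf8_bom(data: bytes) -> bytes:
    """Return data with a UTF-8 BOM prepended, unless one is already there.

    Hypothetical helper for illustration only; not part of groff.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data
```

Applying it to the bytes of a .mom file and writing the result back would give
a file that starts with the three bytes EF BB BF.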
However, it seems to me that `preconv' should not rely on the locale to
detect the file encoding.
Would it make sense to use, for example, libmagic (from the `file' utility) to
make preconv correctly detect the input file encoding?
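
To make the idea of content-based detection concrete, here is a minimal Python
sketch. It is not how `preconv' works and it does not use libmagic; it is a
simpler stand-in technique that checks for a BOM, then tries a strict utf-8
decode, and otherwise falls back to a default. The function name
`guess_encoding' and its `fallback' parameter are assumptions made for this
example.

```python
import codecs


def guess_encoding(data: bytes, fallback: str = "latin1") -> str:
    """Guess the encoding of data from its content, ignoring the locale.

    Illustrative only: a BOM check followed by a strict utf-8 decode.
    Pure ASCII input is reported as utf-8, since ASCII is a subset of it.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return fallback
    return "utf-8"
```

For example, guess_encoding("é".encode("utf-8")) gives "utf-8", while
guess_encoding("é".encode("latin-1")) gives the fallback. A strict decode can
still misclassify short inputs in other 8-bit encodings that happen to be
valid utf-8, which is one reason a dedicated detection library may be
preferable.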
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?51330>
_______________________________________________
Message sent via/by Savannah
http://savannah.gnu.org/