Re: [Groff] mom : unicode in .INCLUDE'd files
From: Ralph Corderoy
Subject: Re: [Groff] mom : unicode in .INCLUDE'd files
Date: Fri, 21 Jul 2017 11:30:00 +0100
Hi Erich,
> When I enter unicode, like:
>
> ÄÖÜ SS ÒÓÔÕŎŌ Ç äöü ß òóôõŏō ç
>
> ...and process them with pdfmom, they show up perfectly. But if I
> include the same characters in a file with the .INCLUDE macro, they
> disappear.
Those are Unicode codepoints, but what encoding are you using to
represent them in a file as bytes? Is it UTF-8? Of those, only `Ŏ',
U+014E, `Ō', U+014C, and their lowercase forms aren't in ISO 8859-1,
AKA Latin-1.
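One rough way to answer the encoding question is to ask iconv(1)
whether the file decodes cleanly as UTF-8. This is only a sketch:
`is_utf8' is a made-up helper name, and it assumes an iconv that exits
non-zero on invalid input, as the glibc one does. A pure-ASCII file
passes too, since ASCII is a subset of UTF-8.

```shell
# Hypothetical helper: succeeds if the named file is valid UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Assumed usage, with example.mom standing in for the real file:
# is_utf8 example.mom && echo 'valid UTF-8' || echo 'not UTF-8'
```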
> Processed with -P-bcu -Tutf8, they show up like wrong encoded strings.
troff(1) expects its input to be ISO 8859-1. It sounds like, in this
particular test, you're giving it bytes of UTF-8 that it's trying to
interpret as ISO 8859-1.
U+00A3 is a `£'. In UTF-8, it's two bytes; the 0a is the linefeed.
$ hd <<<£
00000000 c2 a3 0a |...|
iso-8859-1(7) shows c2 is `Â' and a3 is `£' and that's how groff
interprets these bytes.
$ groff -Tutf8 <<<£ | grep .
Â£
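The same misreading can be reproduced without groff. This sketch
assumes iconv(1) is installed; `latin1_view' is a made-up name. It
decodes its input as ISO 8859-1 and re-encodes it as UTF-8 for the
terminal, which is roughly what happens to the two bytes above.

```shell
# Hypothetical helper: treat stdin as ISO 8859-1, emit UTF-8.
latin1_view() {
    iconv -f ISO-8859-1 -t UTF-8
}

# The UTF-8 bytes of U+00A3 (c2 a3), read as two Latin-1 characters.
printf '\302\243\n' | latin1_view
```

Each input byte becomes its own character, so the single `£' comes out
as `Â' followed by `£'.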
> I tried, in vain, the following pipe:
>
> soelim example.mom | preconv -eutf8 |
> groff -mom -Tutf8 -P-bcu > example.txt
As Denis said, soelim(1) looks for `.so' lines. `.INCLUDE' means
nothing to it.
http://git.savannah.gnu.org/cgit/groff.git/tree/src/preproc/soelim/soelim.cpp#n169
You could try replacing `.INCLUDE' with `.so', so that soelim inlines
the included file before preconv converts the stream.
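As a sketch of that replacement without editing the source by hand,
sed(1) can rewrite the lines on the way in. `include_to_so' is a
made-up name, and this assumes each `.INCLUDE' sits at the start of a
line with an unquoted filename. The commented pipeline reuses the flags
from the attempt quoted above and is untested here.

```shell
# Hypothetical helper: turn mom's .INCLUDE lines into .so lines
# so that soelim recognises them.
include_to_so() {
    sed 's/^\.INCLUDE[[:space:]][[:space:]]*/.so /' "$@"
}

# Assumed usage, mirroring the quoted pipeline:
# include_to_so example.mom | soelim | preconv -eutf8 |
#     groff -mom -Tutf8 -P-bcu > example.txt
```

An `.INCLUDE' inside an included file would not be rewritten this way,
since sed only sees the top-level file.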
--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy