Re: nroff char '0255' bug?
From: Werner LEMBERG
Subject: Re: nroff char '0255' bug?
Date: Thu, 03 Jul 2003 18:05:28 +0200 (CEST)
> The byte-code 0xAD is octal 255, and in Latin1 encoding it
> represents a kind of hyphen character. A very easy way to create
> such a file is to catch the output of:
>   echo axb | tr 'x' '\255'
> A full demo is easy as well:
>   echo axb | tr 'x' '\255' | nroff | more
> or:
>   echo axb | tr 'x' '\255' | nroff | od -c
>
> The output of nroff (1.18) for this input-data shows the 'a' and the
> 'b', but nothing in between. Previous nroff-versions such as 1.16.1
> showed a proper hyphen between the 'a' and the 'b'.

From the README file:

  o Using the latin-1 input character 0xAD (soft hyphen) for the `shc'
    request was a bad idea. Instead, it is now translated to `\%',
    and the default hyphenation character is again \[hy]. Note that
    the glyph \[shc] is not useful for typographic purposes; it only
    exists to have glyph names for all latin-1 characters.

So the new behaviour is the correct one (within the groff universe).
The main function of the soft hyphen in groff is to indicate a
possible hyphenation point -- for groff, 0xAD is a special character
by default.

An excellent discussion on this topic can be found here:

  http://www.cs.tut.fi/~jkorpela/shy.html

It has also been recently discussed on the linux-utf8 mailing list.
The conclusion is that you should never use 0xAD...
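
If the input is known to be Latin-1 and its soft hyphens are not wanted as
hyphenation hints, one way to follow that advice is to filter the byte out
before groff sees it. A minimal sketch, assuming a POSIX shell and a
byte-oriented tr; the function name shy_to_hyphen is made up for
illustration. It turns each 0xAD into a plain hyphen, which is what the
original report expected to see. It must not be applied to UTF-8 input,
where the byte 0xAD occurs inside multi-byte sequences.

```shell
# Hypothetical helper: replace every Latin-1 soft hyphen (0xAD, octal 255)
# on stdin with an ASCII hyphen-minus.  LC_ALL=C keeps tr working on bytes.
shy_to_hyphen() {
    LC_ALL=C tr '\255' '-'
}
```

With nroff installed, the demo from the report would then read:
echo axb | tr 'x' '\255' | shy_to_hyphen | nroff | od -c
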
To change groff's behaviour you can say e.g.

  .tr \[char173]\[char173]
  .trin \[char173]\[hy]

Then a hyphen is printed for all occurrences of 0xAD.

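
For a quick check from the shell, the two requests can be put in front of
the test input from the original report. This is only a sketch: the
function name with_shy_as_hyphen is invented here, and it assumes the
requests take effect as described when they appear at the start of the
input.

```shell
# Hypothetical wrapper: emit the two translation requests, then pass
# stdin through unchanged, so that groff reads the requests first.
with_shy_as_hyphen() {
    printf '%s\n' '.tr \[char173]\[char173]' '.trin \[char173]\[hy]'
    cat
}
```

With nroff installed, it could be used like this:
echo axb | tr 'x' '\255' | with_shy_as_hyphen | nroff | od -c
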
Werner