Re: [Groff] Installing Russian Type-1 Fonts


From: Werner LEMBERG
Subject: Re: [Groff] Installing Russian Type-1 Fonts
Date: Sat, 20 Aug 2011 19:55:24 +0200 (CEST)

>> However, please avoid the term 'AGL compatible'.  We are not
>> talking about glyphs but about characters!
>
> Maybe this confusion is not only my fault but also the manual's:
>
>     The distinction between input, characters, and output, glyphs,
>     is not clearly separated in the terminology of groff; for
>     example, the char request should be called glyph since it
>     defines an output entity.

Yes, this is a mess, introduced long before I was involved in groff.
However, this affects the terminology but not the documentation
itself.  AFAIK, I've fixed all places in the doc files to make a clear
distinction between input characters and output glyphs.

> As I now understand, there's no internal representation of
> characters in groff.  There are only input characters and output
> entities.

Actually, it's the same object, containing a reference to both the
input character and the corresponding output glyph.

> The former ones are found on the input stream, sometimes together
> with escape sequences specifying output entities directly -- like
> \[uXXXX] with Russian UTF-8 input after processing by preconv.  The
> latter ones are stored in groff's intermediate output and read in by
> postprocessors.

This is correct, more or less.

> If the postprocessor is not targeting a character-cell device, then
> these output entities are also called glyphs, but they are not to be
> confused with, say, the glyphs of a PostScript font, about which
> groff itself knows nothing and it's the grops postprocessor that,
> using its font-definition files, converts groff's glyphs into
> PostScript glyphs.

Correct.

> The Groff Glyph List (GGL) is just a fixed set of glyph identifiers
> without a predefined mapping either from input characters, which is
> defined by character translation requests like .trin in groff source
> files, or to the symbols in the resulting document, because it is up
> to the postprocessor whether (and how) to interpret them.

Correct.

> It seems to me that the GGL was created to provide a default support
> for 8-bit encodings that would work out-of-the-box,

No.  The GGL has been modeled after the AGL (using similar rules to
construct some glyph names algorithmically) and the LICR, the LaTeX
Internal Character Representation, which is also a collection of
internal entities.

> and to have meaningful identifiers for the symbols of the non-ASCII
> part of the Latin-1 encoding, thereby standardizing the names of
> these 8-bit symbols across all postprocessors.

troff always had a lot of glyphs which are neither ASCII nor Latin-1!
It's better to avoid the term `8-bit' here since it sounds as if there
were a limit of 256 entities.

> It probably came into existence when the hard-coded dependency on
> Latin-1 was removed, because now the font files had to substitute
> something for glyph names \[char128]-\[char255] which they had
> relied upon.

Yes, and to harmonize all entity names across all devices.  In
particular, devdvi had some conflicting names.

> Am I correct in suggesting that the Adobe Glyph List algorithm is
> used in afmtodit?

Kind of.  It's a simplified one, and the resulting glyph names are
tailored for groff.
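
To illustrate what "algorithmic" glyph names means here, the following
is a minimal Python sketch, not afmtodit's actual code.  It assumes the
two algorithmic forms from the AGL specification (`uni' followed by
groups of four uppercase hex digits, and `u' followed by four to six
uppercase hex digits) and rewrites them into groff-style names of the
form uXXXX[_XXXX...].  The function name agl_to_groff_name is
hypothetical, and the sketch ignores details such as decomposition and
names listed explicitly in the AGL.

```python
import re

# Hypothetical helper for illustration only; afmtodit is more involved.
_UNI = re.compile(r"uni((?:[0-9A-F]{4})+)")   # e.g. uni0430, uni00410308
_U = re.compile(r"u([0-9A-F]{4,6})")          # e.g. u0430, u1F600


def agl_to_groff_name(name):
    """Map an algorithmic AGL glyph name to a groff-style uXXXX name.

    Returns None if the name is not one of the two algorithmic forms.
    """
    m = _UNI.fullmatch(name)
    if m:
        digits = m.group(1)
        # Split into 4-digit code points and join them with underscores.
        parts = [digits[i:i + 4] for i in range(0, len(digits), 4)]
        return "u" + "_".join(parts)
    m = _U.fullmatch(name)
    if m:
        return "u" + m.group(1)
    return None


if __name__ == "__main__":
    for glyph in ("uni0430", "uni00410308", "u1F600", "afii10065"):
        print(glyph, "->", agl_to_groff_name(glyph))
```

Names that are not algorithmic (such as afii10065 above) would need a
lookup table instead, which this sketch deliberately leaves out.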

>> Contrary to TeX, groff handles hyphenation before the conversion
>> from characters to glyphs has happened (more or less).
>
> More or less, because the input file may already contain escapes for
> addressing output entities directly, in which case groff has to
> convert them to 'phantom' input characters which were never on the
> input yet must be used for hyphenation.

Yes.  Another reason is that the character and glyph representation
share the same structure; this means that there is not a strict

               hyphenation
  character -----------------> glyph

model.

>> > But generally, this map cannot be inversely applied because
>> > several input characters may be mapped into one internal
>> > entity. What does groff do in this case?
>>
>> Please give me an example where this is relevant to hyphenation.
>
> An error in the mapping file, like this:
>
>     .trin a\[u0430]
>     .trin b\[u0430]
>
> makes it impossible for groff to calculate the hyphenation code for
> \[u0430], yet otherwise such a setup using UTF-8 input remains fully
> functional.

This is not the example I expected.  In other words, you don't
have one, which is good :-)
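
The ambiguity in the quoted .trin example can be made concrete with a
small Python sketch.  It only models the translation table as a plain
dictionary (an assumption made for illustration; groff's internal data
structures are different) and shows that a many-to-one mapping has no
unique inverse.  The names trin, invert and ambiguous are hypothetical.

```python
from collections import defaultdict


def invert(mapping):
    """Group input characters by the glyph they are translated to."""
    inverse = defaultdict(list)
    for char, glyph in mapping.items():
        inverse[glyph].append(char)
    return dict(inverse)


# Models the two requests from the quoted message:
#   .trin a\[u0430]
#   .trin b\[u0430]
trin = {"a": "u0430", "b": "u0430"}

# Glyphs reachable from more than one input character cannot be mapped
# back to a single character.
ambiguous = {g: chars for g, chars in invert(trin).items() if len(chars) > 1}

if __name__ == "__main__":
    print(ambiguous)
```

Any glyph that shows up in ambiguous has more than one candidate input
character, which is the situation the quoted message describes.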


    Werner


