Re: [Groff] Some thoughts on glyphs

From: Alejandro López-Valencia
Subject: Re: [Groff] Some thoughts on glyphs
Date: Tue, 27 Aug 2002 07:35:35 -0500

----- Original Message -----
From: "Werner LEMBERG" <address@hidden>
To: <address@hidden>
Sent: Monday, August 26, 2002 5:32 AM
Subject: Re: [Groff] Some thoughts on glyphs

> Dear friends,
> in April I suggested to extend the \[...] escape to support composite
> glyphs.
> I reexamined my old letter and found some deficiencies, so here is my
> new proposal.  Please comment.

[scheme explanation; seems consistent and parsimonious to me, no problems
with it. My mind boggles, but Unicode always makes me dizzy, anyway.]

> I've completely dropped the idea that groff does something like
> `\z\[ho]A' automatically if `\[A ho]' is not defined.  Here is a revised
> version of how a latin2 input encoding could be implemented, assuming
> standard PS fonts:
>   .\" The rather generic .composite calls could be in a file
>   .\" `glyph.tmac' which is always loaded at start-up of groff.
>   .
>   .composite ho u0328
>   .composite ah u030C
>   .composite aa u0301
>   ...

Should we strive to map every Unicode range in glyph.tmac, or only Latin
Extended-A, Latin Extended-B and their extensions (perhaps Cyrillic too),
plus the ranges needed for typesetting European and Slavic languages (math
symbols, dingbats, etc.), leaving the CJKV ranges as optional files to be
loaded on demand? (You already said that complex context-dependent
typesetting, as in Arabic and most Indic scripts, is out of scope.) Perhaps
the set of Unicode ranges loaded by default could be a runtime option, with
a sensible default that can be changed through a "configure" variable before
compilation, like the paper size and the PostScript spooler flags.

The CJKV ranges are huge and would slow down startup a lot, but I believe
they will become a necessary part of the system. See, for example, Jie
Zhang's question today about Simplified Chinese typesetting, which takes
us, I think, to the triroff extensions I mentioned a long time ago.

What I like about your proposal, and the input encoding mapping mechanism
you propose, is that someone with enough determination could create input
encoding mappings as big as Big-5 to UTF-8 or Shift-JIS to UTF-8
(would UTF-8 be the internal encoding?).
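
To make the size of that job concrete, here is a rough sketch in Python (not
groff) of what such a mapping amounts to: every legacy byte sequence is
resolved to a Unicode code point, which is then written out as UTF-8. The
helper name to_utf8 is made up for illustration, and the codec tables are
the ones shipped with Python, not anything groff provides.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Map legacy-encoded bytes to Unicode code points, then to UTF-8."""
    # decode() performs the "input encoding -> Unicode" table lookup;
    # encode() serializes the resulting code points as UTF-8.
    return raw.decode(source_encoding).encode("utf-8")


# Sample input: U+4E2D U+6587 in Big-5, U+65E5 U+672C in Shift-JIS.
big5_bytes = "\u4e2d\u6587".encode("big5")
sjis_bytes = "\u65e5\u672c".encode("shift_jis")

utf8_from_big5 = to_utf8(big5_bytes, "big5")
utf8_from_sjis = to_utf8(sjis_bytes, "shift_jis")
```

A groff equivalent would need one mapping entry per code point, which is
where the thousands of CJKV entries come from.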

>   .de latin2-tr
>   .  trin \\$1\\$1
>   .  if !c\\$2 \
>   .    if (\n[.$] == 3) \
>   .      char \\$2 \\$3
>   .  if !c\\$1 \
>   .    trin \\$1\\$2
>   ..
>   .
>   .latin2-tr \[char161] "\[A ho]" "\o'A\[ho]'"
>   .latin2-tr \[char162] \[ab]
>   .latin2-tr \[char163] \[/L]
>   .latin2-tr \[char164] \[Cs]
>   .latin2-tr \[char165] "\[L ah]" "\o'L\[ah]'"
>   .latin2-tr \[char166] "\[S aa]" "\o'S\[aa]'"
>   ...
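
As a cross-check of the table above, the following Python sketch (outside
groff, using only the standard library) composes each base letter with the
combining character named in the .composite lines and compares the result
with what ISO 8859-2 assigns to the same code. The names COMPOSITE, compose
and latin2_char are invented for this illustration.

```python
import unicodedata

# Accent glyph names from the quoted .composite lines, mapped to the
# Unicode combining characters they are associated with there.
COMPOSITE = {"ho": "\u0328", "ah": "\u030C", "aa": "\u0301"}


def compose(base: str, accent: str) -> str:
    """Return the precomposed character for base + accent (NFC form)."""
    return unicodedata.normalize("NFC", base + COMPOSITE[accent])


def latin2_char(code: int) -> str:
    """Return the Unicode character ISO 8859-2 assigns to a byte value."""
    return bytes([code]).decode("iso8859_2")


# Latin-2 codes paired with the composite glyphs one would expect there.
checks = {161: ("A", "ho"), 165: ("L", "ah"), 166: ("S", "aa")}
results = {code: compose(*pair) == latin2_char(code)
           for code, pair in checks.items()}
```

Each entry in results is True when the composite glyph matches the Latin-2
assignment; note that code 166 is S with acute.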

Speaking of the input-encoding-to-Unicode mappings: I think they should be
configurable at runtime too, with a default determined at compilation time.
I see this as an advantage for people who actually use an input encoding
other than Latin-1 (I can think of most other ISO-8859 encodings, Windows
code pages, KOI-8, Mac OS encodings under OS X, CJKV encodings, and
non-standardized encodings such as those for Georgian and Armenian).
