Re: [Groff] Re: groff: radical re-implementation

From:	Werner LEMBERG
Subject:	Re: [Groff] Re: groff: radical re-implementation
Date:	Wed, 18 Oct 2000 16:54:53 +0200 (CEST)

> However, thank you for explaining glyphs.  I also see that you
> understand the problems of Japanese character codes well.

Well, I'm the author of the CJK package for LaTeX, I've written a
ttf2pk converter, and I'm a member of the FreeType core team :-)

> Note that CJK ideographs also have a distinction between character
> and glyph.  The most famous example is the two variants of the
> 'tall or high' character.  Japanese people regard these two as the
> same in daily use, but as different when they are used in personal
> names and so on.

I know these problems all too well -- AFAIK, in JIS X 0208 these two
variants are unified.  Do you know details about the new JIS X 0213
standard?

> I don't know how Chinese and Korean people treat them.  It may be
> different.  However, IMHO, we should set this problem aside for now
> since there is so far no standard for treating these variants
> properly.  Though it is important, it is not in our scope.

If you are working on a terminal you need a character set which
distinguishes the two forms.

> > A `glyph code' is just an arbitrary registration number for a glyph
> > specified in the font definition file.
> 
> Then the 'font definition file' will be unreasonably large.  I think
> at least CJK ideographs and Korean precomposed Hangul have to be
> treated in a different way.  (Ukai has already pointed out this
> problem.  jgroff uses 'wchar<EUCcode>' for glyph names of Japanese
> characters.)

Right.  I think I've answered this problem in my last mail (regarding
a `glyphclass' directive in font description files).

> A problem.  When compiled on an internationalized OS, the names for
> encodings (for iconv(3) and so on) are implementation-dependent (you
> know, there are many implementation-dependent items in the standard
> C/C++ languages).  A solution is: we can have a hard-coded
> translation table between implementation-dependent encoding names
> and macro names for -m.  The table must be adapted per OS (by the
> './configure' script or so).  A minimal table would translate
> every implementation-dependent encoding name into the 'ascii' macro,
> since almost all encodings in the world are supersets of ASCII.  A
> full table for an OS would cover the list generated by
> 'iconv --list'.

I don't think so.  For example, we could restrict ourselves to MIME
character set tags, which are standardized.
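
To make the two positions above concrete, here is a minimal Python
sketch (not groff code) of a lookup that normalizes
implementation-dependent encoding names to MIME charset tags and falls
back to ASCII for unknown names, as in the quoted proposal.  The table
entries and the names ALIASES, mime_charset and macro_name are
assumptions made for illustration only.

```python
# Hypothetical alias table: implementation-dependent encoding names
# mapped to MIME charset tags.  The entries are illustrative and are
# not taken from any particular OS or from groff.
ALIASES = {
    "eucjp": "EUC-JP",
    "euc-jp": "EUC-JP",
    "ujis": "EUC-JP",
    "iso8859-1": "ISO-8859-1",
    "iso_8859-1": "ISO-8859-1",
    "latin1": "ISO-8859-1",
    "utf8": "UTF-8",
    "utf-8": "UTF-8",
}


def mime_charset(name):
    """Return the MIME tag for an encoding name, or 'US-ASCII' if unknown."""
    return ALIASES.get(name.strip().lower(), "US-ASCII")


def macro_name(name):
    """Derive a macro name from the MIME tag (assumed naming scheme)."""
    return mime_charset(name).lower()
```

With such a table only the alias list would vary per OS, while the
macro names would follow the standardized tags.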

> Since the '-m' option is generated by groff and passed to troff,
> groff has to have '#ifdef I18N' code.  (Or, the code can be
> integrated into the preprocessor if we design the preprocessor to
> invoke troff.)

Indeed, the default behaviour should be that the preprocessor adds a

  .mso tmac.<charset>

line or something similar to the document, but it must be possible to
override it manually.
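
A rough Python sketch of that default, purely as an illustration: the
function name add_charset_macro and its `override' parameter are
hypothetical and stand in for whatever option the real preprocessor
would offer.

```python
def add_charset_macro(document, charset, override=None):
    """Prepend a '.mso tmac.<charset>' request to the document text.

    'charset' is the automatically detected character set; 'override'
    (hypothetical) is a manual choice that takes precedence when given.
    """
    chosen = override if override is not None else charset
    return ".mso tmac.%s\n%s" % (chosen, document)
```

For example, add_charset_macro(text, "euc-jp") would put
`.mso tmac.euc-jp' on the first line, while passing override="latin1"
would request tmac.latin1 instead.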


    Werner
