Re: Codepage?

From: Antoine Leca
Subject: Re: Codepage?
Date: Fri, 28 Apr 2000 16:57:37 +0200

Rob Kramer wrote:
> My application uses FreeType to show fonts. The text shown comes from a file
> that was cut-and-pasted from WinWord for example (taking care that the text
> is not 7th-bit stripped in the process).

You mean, your application displays (using FreeType) a chunk of text
that ultimately comes from a Cut/Copy operation done in Windows Word.
Do I get it right?

So in fact, the text is 8-bit encoded.

So you need a way to convert from the encoding used in Windows Word
(I assume you, or your user, know what it is) to the character codes
stored in the font's character map (and then to glyph indices, using a
function such as TT_Char_Index).
Do I still get it right so far?

> If the user types Thai text in Word using the 'PSL Irene' font (psli.ttf),
> this should somehow appear correctly in my application.

I don't know that particular font, but I believe it does not matter.

> I figured out that the codepage that font uses is actually the 'Symbol' page
> at 0xf000 (I believe). So if the user wants to display using psli.ttf, he
> has to specify the fontname and the codepage.

Well, that is one possible solution: mapping the whole 8-bit
encoding to a "symbol" font. The same is done with e.g. MS Line Draw
(for codepage 437, the standard PC MS-DOS code page).
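As a rough illustration of that "symbol" approach, here is a minimal
Python sketch. It assumes, as Rob suggests, that the font places its
glyphs at 0xF000 plus the 8-bit value; the names SYMBOL_BASE and
bytes_to_symbol_codes are made up for this example and are not part of
FreeType.

```python
from typing import List

# Assumption: the font's "symbol" character map starts at 0xF000.
SYMBOL_BASE = 0xF000


def bytes_to_symbol_codes(data: bytes) -> List[int]:
    """Map each 8-bit value to a character code in the 0xF000 block."""
    return [SYMBOL_BASE + b for b in data]


if __name__ == "__main__":
    codes = bytes_to_symbol_codes(b"\xa1\xd2")
    print([hex(c) for c in codes])  # prints ['0xf0a1', '0xf0d2']
```

The resulting codes are what you would then look up in the font's
character map to get glyph indices.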

> I had problems with a Russian font because I couldn't see what the codepage
> was, and I've never seen a codepage 1200 to be honest.

Codepage 1200 is Unicode, you know, the ultimate 16-bit encoding...
where 0x0E01 is Thai ko kai, 0x0410 is Russian A, etc.
Most fonts, and particularly "recent" fonts, are encoded with this
scheme. As a result, it is part of your application's job to convert
from the 8-bit format you receive to the 16-bit Unicode format.
Depending on your platform, the job has more than probably already been
done for you, but the particular solution you have to use (iconv,
mbs[r]towcs, MultiByteToWideChar, recode, ...) is platform-specific.
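To make the 8-bit to Unicode step concrete, here is a small sketch in
Python, using its built-in codecs rather than any of the tools named
above. It assumes the incoming text is in Windows codepage 874 (Thai)
or 1251 (Cyrillic); which codepage actually applies is something only
you or your user can tell. The helper name to_unicode_codes is invented
for this example.

```python
from typing import List


def to_unicode_codes(data: bytes, codepage: str) -> List[int]:
    """Decode 8-bit text with the given codepage and return Unicode values."""
    text = data.decode(codepage)
    return [ord(ch) for ch in text]


if __name__ == "__main__":
    # Assumption: 0xA1 comes from Thai text in cp874, 0xC0 from Russian in cp1251.
    thai = to_unicode_codes(b"\xa1", "cp874")
    russian = to_unicode_codes(b"\xc0", "cp1251")
    print(hex(thai[0]), hex(russian[0]))  # prints 0xe01 0x410
```

The same byte value gives different Unicode values under different
codepages, which is why the codepage has to be known before the font is
consulted.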

> > > Right now I have to get the user to specify the codepage if he wants it
> > > to be anything other than zero, but users usually don't understand much
> > > about codepages..
> >
> > And Freetype does not understand much either!
> Do you mean some Freetype users? :)

No, I meant that Freetype, as a library, cannot deal with encoding
issues like the ones I explained above. The job is up to the caller.
This was chosen at design time for a number of reasons. One of them is
that there exists a wide range of encodings in use (supporting them
all would mean a fat library), combined with a wide range of already
available and heavily used solutions (which means that the fat part I
was talking about is likely to be useless to a large number of users...)

