freetype-devel


From: Craig White
Subject: Re: Dealing with different character map formats when mapping glyph indices to character codes
Date: Tue, 23 May 2023 13:00:26 -0400

I just looked at your code, and based on that, I've been overthinking this.
All I need to do is look up the characters I'm interested in and store the glyph indices, then check for those indices later.  That will break in complex cases, but it's good enough for a first-pass attempt.
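A minimal sketch of that first-pass idea, written in Python for brevity rather than as auto-hinter code. The `lookup` callable stands in for a forward cmap lookup such as `FT_Get_Char_Index`; the toy table, the glyph indices, and the helper names are all made up for illustration.

```python
from typing import Callable, Dict, Iterable


def collect_glyph_indices(lookup: Callable[[int], int],
                          chars: Iterable[str]) -> Dict[int, str]:
    """Return {glyph index: character} for the characters of interest."""
    wanted: Dict[int, str] = {}
    for ch in chars:
        gid = lookup(ord(ch))
        if gid != 0:  # glyph index 0 means "no glyph for this character code"
            # Keep the first character seen if two characters share a glyph.
            wanted.setdefault(gid, ch)
    return wanted


# Hypothetical stand-in for a real character map (indices are invented).
toy_cmap = {ord("i"): 76, ord("j"): 77, ord("a"): 68}

# Done once per face: forward lookups only, no reverse cmap needed.
adjust = collect_glyph_indices(lambda code: toy_cmap.get(code, 0), "ij!")


def needs_adjustment(glyph_index: int) -> bool:
    """Checked later, when a glyph is about to be hinted."""
    return glyph_index in adjust
```

As noted, this only catches glyphs that are directly reachable through the cmap; glyphs reached through substitutions (alternates, ligatures) would be missed.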

On Tue, May 23, 2023 at 12:55 PM Craig White <gerzytet@gmail.com> wrote:
I wanted this reverse mapping because I am trying to implement adjustments for specific glyphs to improve rendering in the auto-hinter, such as "i", where the dot merges with the stem at low ppem.
I need to know when to apply these adjustments, which requires knowing whether or not the glyph being rendered is "i", or another similarly affected glyph.
Mapping glyphs to character codes would make this easy.

On Tue, May 23, 2023 at 12:40 PM Hin-Tak Leung <htl10@users.sourceforge.net> wrote:
On Tuesday, 23 May 2023, 17:19:46 BST, Craig White <gerzytet@gmail.com> wrote:

> I was looking into how freetype maps character codes to glyph indices, and learned that there are many different formats the character map can be in, not to mention the one-to-many and many-to-one mappings that Werner mentioned.
> Will it be necessary to implement the reverse mapping separately for every cmap format?

Not sure why you need or want to implement it in FreeType. A glyph ID is unique per glyph. Some glyphs are not mapped in any character encoding, e.g. in "symbol fonts with custom encoding vectors" (there is even a name for such fonts).

Perhaps it is best to STOP thinking about (Unicode) characters. Glyphs are shaped drawings with a glyph ID. Some of them, for example ligatures ("combo characters" like "ff", etc.), correspond to two (Unicode) characters. And in Arabic, almost every character has 2 to 4 glyph shapes, called isolated forms and init/medi/fini forms.

I think I actually have a Python program which does the reverse map (for the purpose of dropping some glyphs in the many-to-one scenario): examples/cjk-multi-fix.py in my freetype-py fork ( https://github.com/HinTak/freetype-py/ ; you might need to switch to the font-diag branch to see it if it is not the default branch).

The OpenType spec and font technology were designed to make lookups in the most frequently used direction (from character encoding to glyph ID) fast and easy.
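Because the cmap is built for the forward direction, a reverse map generally has to be assembled by enumerating the forward mapping once. The sketch below illustrates that idea only; it is not the cjk-multi-fix.py script mentioned above. It assumes the (character code, glyph index) pairs have already been collected, for example by walking the active charmap with `FT_Get_First_Char` / `FT_Get_Next_Char`; the sample pairs and the function name are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_reverse_map(pairs: Iterable[Tuple[int, int]]) -> Dict[int, List[int]]:
    """Group character codes by the glyph index they map to."""
    reverse: Dict[int, List[int]] = defaultdict(list)
    for charcode, gid in pairs:
        if gid != 0:  # skip codes that map to the missing glyph
            reverse[gid].append(charcode)
    return dict(reverse)


# Invented sample data: two character codes share glyph 36 (many-to-one).
toy_pairs = [(0x41, 36), (0x391, 36), (0x42, 37)]

reverse = build_reverse_map(toy_pairs)

# Glyphs reachable from more than one character code.
shared = {gid: codes for gid, codes in reverse.items() if len(codes) > 1}
```

Each glyph index maps to a list rather than a single code, since one glyph can serve several character codes. Glyphs that no character code reaches (ligatures, positional forms) simply never appear in the result.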
