From: Werner LEMBERG
Subject: Re: Dealing with different character map formats when mapping glyph indicies to character codes
Date: Wed, 24 May 2023 04:28:54 +0000 (UTC)
> Will it be necessary to implement the reverse mapping separately for
> every cmap format?
No. The idea is rather to use HarfBuzz as much as possible, see
`afshaper.c`, and in particular `af_shaper_get_cluster` and
`af_shaper_get_coverage`.
Every OpenType feature consists of one or more lookups. Using
`hb_ot_layout_collect_lookups` and
`hb_ot_layout_lookup_collect_glyphs`, `af_shaper_get_coverage`
collects the glyph coverage for every feature.
These functions should help you set up a one-to-many mapping from
glyph index to input (Unicode) character codes. Some glyphs won't
have such a mapping; however, I think it is not necessary to exclude
GPOS lookups (contrary to blue zone handling); see the big comment
starting at line 307 in file `afshaper.c`.
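
To make the one-to-many mapping concrete, here is a minimal sketch of a
table that records, per glyph index, every character code found to
produce that glyph. All names (`ReverseMap`, `GlyphEntry`,
`reverse_map_*`) are hypothetical and exist neither in FreeType nor in
HarfBuzz; how the (glyph, character code) pairs are discovered is left
out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical entry: all character codes known to yield one glyph. */
typedef struct {
    uint32_t *codes;     /* Unicode code points for this glyph */
    size_t    count;     /* number of code points stored */
    size_t    capacity;  /* allocated slots in `codes` */
} GlyphEntry;

/* Hypothetical reverse map, indexed by glyph index. */
typedef struct {
    GlyphEntry *entries;
    size_t      num_glyphs;
} ReverseMap;

/* Allocate an empty map; returns 0 on success, -1 on failure. */
int reverse_map_init(ReverseMap *map, size_t num_glyphs)
{
    assert(map != NULL);
    map->entries = calloc(num_glyphs ? num_glyphs : 1,
                          sizeof *map->entries);
    map->num_glyphs = map->entries ? num_glyphs : 0;
    return map->entries ? 0 : -1;
}

/* Record that `code` maps to `glyph`; duplicates are ignored.
   Returns 0 on success, -1 on a bad index or allocation failure. */
int reverse_map_add(ReverseMap *map, uint32_t glyph, uint32_t code)
{
    GlyphEntry *e;
    size_t i;

    if (glyph >= map->num_glyphs)
        return -1;
    e = &map->entries[glyph];
    for (i = 0; i < e->count; i++)
        if (e->codes[i] == code)
            return 0;                      /* already recorded */
    if (e->count == e->capacity) {
        size_t    cap = e->capacity ? e->capacity * 2 : 2;
        uint32_t *p   = realloc(e->codes, cap * sizeof *p);

        if (!p)
            return -1;
        e->codes    = p;
        e->capacity = cap;
    }
    e->codes[e->count++] = code;
    return 0;
}

/* Release all memory held by the map. */
void reverse_map_done(ReverseMap *map)
{
    size_t i;

    for (i = 0; i < map->num_glyphs; i++)
        free(map->entries[i].codes);
    free(map->entries);
    map->entries    = NULL;
    map->num_glyphs = 0;
}
```

Glyphs without any mapping simply keep `count == 0`. A font that reuses
one glyph for two code points (say, U+0041 and U+0391) would end up with
both codes stored in the same entry.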
Setting up the one-to-one mapping cases from glyph index to a single
input character code should be straightforward. However, you also
have to take care of ligatures – I think it makes sense to support
them in the forthcoming database. Assuming that the 'fi' ligature is
in the database, you can use `hb_shape` to check whether input
characters 'f' + 'i' map to a single glyph. This test should be done
for all features handled specially by the auto-hinter (see file
`afstyles.h`, macro `META_STYLE_LATIN`; names like 'c2cp' are OpenType
feature names).
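
The ligature test described above could be structured roughly as
follows. This is only a sketch under stated assumptions: the shaping
step is hidden behind a hypothetical callback type `ShapeFunc`, which a
real implementation would presumably back with `hb_shape`, and
`demo_shape` is a toy stand-in (not HarfBuzz) that knows a single
'f' + 'i' ligature so that the logic can be exercised in isolation.
The glyph index 100 is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback: shape `n_codes` code points with one feature
   enabled, write at most `max_glyphs` glyph indices to `glyphs`, and
   return the number of glyphs written. */
typedef size_t (*ShapeFunc)(const uint32_t *codes, size_t n_codes,
                            const char *feature,
                            uint32_t *glyphs, size_t max_glyphs,
                            void *user_data);

/* Return 1 and store the glyph in `*glyph` if the whole input sequence
   shapes to exactly one glyph under `feature`; return 0 otherwise. */
int sequence_forms_ligature(ShapeFunc shape, void *user_data,
                            const uint32_t *codes, size_t n_codes,
                            const char *feature, uint32_t *glyph)
{
    uint32_t out[8];
    size_t   n;

    assert(shape != NULL && codes != NULL && glyph != NULL);
    if (n_codes < 2)
        return 0;                  /* a single character is no ligature */
    n = shape(codes, n_codes, feature, out, 8, user_data);
    if (n != 1)
        return 0;
    *glyph = out[0];
    return 1;
}

/* Toy shaper for illustration only: with feature "liga", 'f' + 'i'
   becomes the made-up glyph 100; every other character maps to a glyph
   index equal to its code point. */
size_t demo_shape(const uint32_t *codes, size_t n_codes,
                  const char *feature,
                  uint32_t *glyphs, size_t max_glyphs,
                  void *user_data)
{
    size_t in = 0, out = 0;

    (void)user_data;
    while (in < n_codes && out < max_glyphs) {
        if (strcmp(feature, "liga") == 0 && in + 1 < n_codes &&
            codes[in] == 'f' && codes[in + 1] == 'i') {
            glyphs[out++] = 100;
            in += 2;
        } else {
            glyphs[out++] = codes[in++];
        }
    }
    return out;
}
```

Running the check once per feature of interest, for every ligature
candidate in the database, would then yield the many-to-one entries
that the per-character mapping cannot provide.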
For consistency, you should add one or more `af_shaper_*` functions to
create this mapping.
Werner