Re: OpenType Layout code
From: Werner LEMBERG
Subject: Re: OpenType Layout code
Date: Fri, 02 Jun 2000 07:37:30 +0000 (GMT)

> > > There is however something that I find hard to understand: why
> > > is there a need to "update" the glyph properties each time a
> > > lookup is performed? I know that the code really assigns
> > > properties to glyphs that have none in the original GDEF
> > > table, but I fail to see in the OpenType specification a
> > > paragraph about it. Could you tell me where the relevant
> > > piece of information is to be found? I'd like to know the
> > > exact details and conditions in order to experiment with other
> > > kinds of implementations/ assignments...
> >
> > I fear that you won't find any detailed information on properties.
> >
> > There are two kinds of properties. The first one is what you find
> > in a GDEF table, if there is one: Let's call this `glyph
> > properties'. For example, if you have the ligature `x' `y'->`a',
> > then the data in the GDEF table is used to assign a glyph property
> > to ligature `a'. If there is no GDEF table (as in trado.ttf), my
> > code guesses that after a ligature substitution, the resulting
> > glyph is of type `ligature'. Note that support for user-defined
> > GDEF data is only implemented to support the marvellous trado
> > font, since older Arabic versions of Windows (and, I believe, even
> > the new OTL engine) have hard-coded support for this font (as Paul
> > Nelson from MS told me). I hope that my solution is generic
> > enough to support other old fonts also -- nevertheless, I doubt
> > that you'll find another OT font without a GDEF table.
> >
> > The other kind is what I call `user properties' or `features'.
> > For example, Arabic has four glyph classes, namely isolated,
> > initial, medial, and final, so the `fina' lookup should affect
> > only final glyphs, the `medi' only medial glyphs, etc. This isn't
> > documented at all in the OT specification (it is just mentioned
> > `that you have to keep track' of some data)! It is a consequence
> > of the implementation, and maybe you are able to find a better
> > solution how to handle this.
> >
> > After you have applied a lookup you must decide which glyphs are
> > affected by the lookup. My solution is to propagate the user
> > property value of the *first glyph* to all the other glyphs.
> > Example: decomposition of the ligature `fl'->`f' `l'. `fl' has
> > user property `foo', so I assign `foo' to `f' and `l' also.
> >
> > Does this answer your question?
> >
> You bet it does :-) However, I'll rephrase your answer to be sure
> that we're speaking about the same thing, this topic seem rather
> complex:
>
> - first of all, you're saying that some key important points of the
> OpenType Layout are simply undocumented. That's nasty, to be
> polite, but we'll have to deal with it given that you cleverly
> managed to get it right. Congratulations for your work, by the
> way :-)
Basically, it is not that nasty. The real problem with the OpenType
specification is that the people at Microsoft (or wherever) apparently
did an implementation first and documented it afterwards, so many
important aspects of OT are only comprehensible if you know how this
implementation works. Without the help of Andrei Burago from MS I
wouldn't have succeeded.
> - apparently, there are two kinds of glyph properties:
>
> * "static" properties, that are normally defined in the GDEF
> table. An important thing is that not all glyphs have a
> property defined for them in this table..
This is not correct. If there is a GDEF, then *all* glyphs are
in one of the following classes:
  unclassified -- a `normal' glyph
  simple       -- not a ligature
  ligature
  mark         -- an accent-like glyph
  component    -- another way to classify elements of a ligature.
                  Its use is *very* obscure, and usually it's not
                  needed.
The difference between `unclassified' and `simple' is also quite weak
since in most cases `unclassified' is treated similarly to `simple'.
The interesting case is when there is no GDEF table. Then the user
has to supply GDEF data, and, if you want to stay independent of a
specific font, this is only possible for glyphs which have a valid
charmap entry. This means that a lot of glyphs don't get a proper
classification.
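
The fallback can be sketched as follows. This is only an illustration
in Python, not the actual FreeType code: the function names, the toy
charmap, and the glyph indices are all made up. The numeric class
values simply follow the order of the list above.

```python
from enum import IntEnum


class GlyphClass(IntEnum):
    # Same order as the list in the message.
    UNCLASSIFIED = 0
    SIMPLE = 1
    LIGATURE = 2
    MARK = 3
    COMPONENT = 4


def build_user_gdef(char_classes, charmap):
    """Turn per-character classes into per-glyph classes.

    char_classes: character code -> GlyphClass (supplied by the user)
    charmap:      character code -> glyph index (taken from the font)
    Only glyphs reachable through the charmap can be classified.
    """
    glyph_classes = {}
    for char_code, glyph_class in char_classes.items():
        glyph_index = charmap.get(char_code)
        if glyph_index is not None:
            glyph_classes[glyph_index] = glyph_class
    return glyph_classes


def glyph_class_of(glyph_index, glyph_classes):
    """Glyphs without user data fall back to `unclassified'."""
    return glyph_classes.get(glyph_index, GlyphClass.UNCLASSIFIED)


# Hypothetical toy font: three encoded glyphs; glyph 9 is a ligature
# that has no charmap entry.
charmap = {0x0627: 3, 0x0644: 4, 0x064E: 5}
char_classes = {
    0x0627: GlyphClass.SIMPLE,  # ALEF
    0x0644: GlyphClass.SIMPLE,  # LAM
    0x064E: GlyphClass.MARK,    # FATHA
}
user_gdef = build_user_gdef(char_classes, charmap)
unencoded_class = glyph_class_of(9, user_gdef)
```

Glyph 9 ends up `unclassified' because nothing in the charmap points
to it, which is exactly the gap described above.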
> However, these definitions/properties can change during the
> layout process (according to your code)
Only for fonts which have no built-in GDEF table. Otherwise, it is
fixed.
> * "dynamic" properties, that are applied to glyphs during the
> layout process, based on the static ones and the lookups,
> substitutions or positioning that occured.
This isn't `dynamic', it is user-defined, and it is *static* also. It
is *not* related to the GDEF data (which you call `static'). It is
related to the OT features only. User-defined properties simply say
which feature will be applied to which input character. Example: For
Arabic, the user has to categorize all input characters to which the
`fina' feature applies. Usually, such characters are `final
characters' as defined in any Arabic grammar. You can directly use
the data from the Unicode book (but please use the explanation of the
Unicode 3.0 book since the 2.0 book has interchanged `left' with
`right' almost everywhere).
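
To make the idea of user-defined properties concrete, here is a
minimal sketch of such a categorization for Arabic. It is an
assumption-laden toy and not the FreeType code: the table covers only
five letters, the function name is invented, and marks and other
transparent characters are ignored entirely.

```python
# Joining behaviour of a few Arabic letters. DUAL letters connect to
# both neighbours; PREV_ONLY letters connect only to the letter that
# precedes them in logical order.
DUAL = "dual"
PREV_ONLY = "prev-only"

JOINING = {
    "\u0627": PREV_ONLY,  # ALEF
    "\u062F": PREV_ONLY,  # DAL
    "\u0628": DUAL,       # BEH
    "\u0644": DUAL,       # LAM
    "\u0645": DUAL,       # MEEM
}


def assign_features(text):
    """Return one feature tag (isol/init/medi/fina) per character."""
    features = []
    for i, ch in enumerate(text):
        cur = JOINING.get(ch)
        prev = JOINING.get(text[i - 1]) if i > 0 else None
        nxt = JOINING.get(text[i + 1]) if i + 1 < len(text) else None

        # A letter connects backwards only if the previous letter is
        # able to connect forwards, i.e. if it is dual-joining.
        joins_prev = cur is not None and prev == DUAL
        # A letter connects forwards only if it is dual-joining and
        # a joining letter follows.
        joins_next = cur == DUAL and nxt is not None

        if joins_prev and joins_next:
            features.append("medi")
        elif joins_prev:
            features.append("fina")
        elif joins_next:
            features.append("init")
        else:
            features.append("isol")
    return features


beh_alef_beh = assign_features("\u0628\u0627\u0628")
lam_meem_meem = assign_features("\u0644\u0645\u0645")
```

The result is one tag per input character; a lookup belonging to the
`fina' feature would then be applied only at the positions tagged
`fina'.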
> - moreover, the high-quality Arabic "pardo.ttf" doesn't contain a
> GDEF table, which is why the test program "ftstrtto" in FT 1.x
> does indeed create one "by hand" to describe Arabic properties.
This is `trado.ttf' :-) Note that the GDEF data doesn't describe
`Arabic' properties, but simple glyph properties: whether it is an
accent, or a ligature, etc. `Arabic' properties are the user-defined
ones.
> - currently, when the code performs layout, it modifies the dynamic
> properties, as well as the static ones.
No. My code only guesses the glyph properties for fonts without a
GDEF table. Example: If a GSUB table makes a ligature out of single
glyphs, and this ligature doesn't have a GDEF entry already (usually
because it isn't available via the charmap), this ligature gets a
`ligature' property.
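
A small sketch of this guess, again in Python with invented names and
glyph indices (it is not the actual implementation): the class table
is only touched when the font has no GDEF table of its own and the
resulting glyph has no class entry yet.

```python
LIGATURE = "ligature"


def apply_ligature(glyphs, components, ligature, glyph_classes,
                   font_has_gdef):
    """Replace every run `components' in `glyphs' by `ligature'.

    glyph_classes is the user-supplied class table (glyph index ->
    class name). It is updated in place, but only for fonts without
    a built-in GDEF table.
    """
    if not components:
        raise ValueError("a ligature needs at least one component")

    out = []
    i = 0
    n = len(components)
    while i < len(glyphs):
        if glyphs[i:i + n] == components:
            out.append(ligature)
            # The guess: an unclassified substitution result is
            # assumed to be a ligature.
            if not font_has_gdef and ligature not in glyph_classes:
                glyph_classes[ligature] = LIGATURE
            i += n
        else:
            out.append(glyphs[i])
            i += 1
    return out


# Hypothetical glyph indices: 4 + 3 form ligature 9.
user_classes = {3: "simple", 4: "simple"}
result = apply_ligature([4, 3, 7], [4, 3], 9, user_classes,
                        font_has_gdef=False)

fixed_classes = {3: "simple", 4: "simple"}
apply_ligature([4, 3, 7], [4, 3], 9, fixed_classes,
               font_has_gdef=True)
```

In the second call nothing is added to the table, since a font with a
built-in GDEF table keeps its classification fixed.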
> If this is correct, I'm wondering if it would not be possible to do
> the following:
>
> compute as many static properties as possible when loading the
> OpenType tables. For example, it should be possible to parse
> the substitution sub-tables to see that "f" + "l" can give "fl".
> Then, we're able to assign a static property to "fl" it is
> hasn't one already..
>
> This would augment the GDEF table a bit, but this work should
> only be done at font open time. We could also hook the
> additional Arabic properties there.
For user-defined GDEF tables (i.e., support for a single font, namely
trado.ttf), this is really too much.
It might be worth considering some caching for GSUB values, but for
GPOS I think it won't work efficiently.
Generally, OT support isn't intended to be done at font open time, as
Andrei Burago told me (this causes a lot of other problems if you
don't implement it like this -- and there are indeed gray spots in the
specs like the `forgotten' update of the version number of the GDEF
header table, which makes the parsing for MarkAttachClassDef clumsy).
If an OT table is needed, it is simply read in and parsed at run time.
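
The `read it when it is needed' approach could look roughly like the
sketch below. The class and the parser function are hypothetical, and
keeping the parsed result for later requests is my assumption here;
the point is only that nothing is parsed at font open time.

```python
class LazyTables:
    """Parse a table on first request, then reuse the result."""

    def __init__(self, parsers):
        # tag -> function without arguments that parses this table
        self._parsers = parsers
        self._parsed = {}

    def get(self, tag):
        if tag not in self._parsed:
            parser = self._parsers.get(tag)
            # A table missing from the font is remembered as None.
            self._parsed[tag] = parser() if parser is not None else None
        return self._parsed[tag]


parse_calls = []


def parse_gsub():
    # Stand-in for real table parsing; records that it was called.
    parse_calls.append("GSUB")
    return {"lookups": []}


tables = LazyTables({"GSUB": parse_gsub})  # nothing is parsed yet
first = tables.get("GSUB")                 # parsed here
second = tables.get("GSUB")                # reused, not parsed again
missing = tables.get("GDEF")               # this font has no GDEF
```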
> - use dynamic properties for the layout process, but without a
> need to modify the static ones. I believe this could greatly
> speed things
Please elaborate.
> Do you think this is feasible, or am I missing something important?
I hope that I've clarified the facts. It would be a good idea to do
some profiling of the OT support to really find out where the `hot
spots' are.
Werner