Comparing GX Line Layout and OpenType layout
by Dave Opstad, Apple Computer, Inc.
Introduction
This document compares the capabilities of the line layout models of
QuickDraw GX and OpenType. The parts of this document are:
- A brief history of the development of GX and
OpenType;
- A technical description of the two formats,
including which parts of the line layout process the various tables are
useful for;
- Where GX has the edge;
- Where OpenType has the edge; and
- A discussion of the system support question.
While every attempt is made to be fair in the comparison, the fact that
the author was principal architect for GX Line Layout will perforce introduce
instances of bias. Similarly, because of this intimate familiarity with
GX, and much less familiarity with OpenType, there may be cases where OpenType
is capable of effects which weren't recognized. In both of these cases,
the author would appreciate hearing about it.
A Brief History
While the first public release of GX Line Layout occurred when GX itself
was released in 1994, a portion of the actual GX layout algorithm shipped
much earlier than that, in 1991. It was part of the WorldScript I code,
specifically the portion that parsed the 'itl5' resource. A patent (#5,416,898)
was granted on the whole GX line layout procedure in May, 1995.
The first public version of the OpenType layout process (or "TrueType
Open" as it was called then) appeared in July, 1995, when the 1.0 version
of the specification was published by Microsoft. As Adobe and Microsoft
began working on defining OpenType, Adobe agreed to use the Microsoft table
definitions.
For further details about the two font formats, please see the Apple or Microsoft
web sites.
The Technical Details
This section describes the steps involved in laying out a line of
text, and the support each format has for each step.
Character to Glyph mapping
Both GX and OpenType maintain the distinction between a character and
a glyph. The user creates text in some encoding or combination of encodings
(e.g. Unicode, Macintosh, ISO 8859, etc.). In order to process this text,
a line layout processor first must take this text and convert it into a
font-specific form, since the tables in the font which drive the layout
process all need to uniquely identify what they're working on.
Both formats use a 'cmap'
table to control the mapping from character codes to glyph indices. Further
processing is then done on glyph indices.
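As a minimal sketch of this step, the mapping can be pictured as a simple lookup. The dictionary and glyph indices below are invented for illustration and do not come from any real font; an actual 'cmap' table is stored as binary subtables.

```python
# Hypothetical 'cmap' data: character code -> glyph index.
# The values are made up; a real font stores this in binary subtables.
TOY_CMAP = {ord("f"): 41, ord("i"): 44, ord("n"): 49}
MISSING_GLYPH = 0  # glyph index 0 is conventionally the "missing glyph"


def map_characters_to_glyphs(text, cmap=TOY_CMAP):
    """Return one glyph index per character, or 0 for unmapped characters."""
    return [cmap.get(ord(ch), MISSING_GLYPH) for ch in text]


glyphs = map_characters_to_glyphs("fin?")
```

Everything after this point in the layout process works on the glyph list, not on the original character codes.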
Bidirectional Processing
Before any further processing, the line needs to be processed to see
if there are any runs of right-to-left text. If there are, then the line
needs to be reordered, using the Unicode bidirectional reordering algorithm.
This support is built into GX; it does not appear in the OpenType Layout
specification, where it seems to be assumed that the client application
has done this work.
One difference in the way GX and OpenType deal with bidirectional processing
lies in how the directional properties are obtained. GX uses a property table,
which identifies directional and other properties for every glyph in a
font. OpenType relies instead on obtaining the properties from the character
code, specifically its Unicode value. This means that OpenType cannot deal with
text which does not have Unicode equivalents.
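The difference in where the directional property comes from can be sketched as follows. The per-glyph table is a hypothetical stand-in for a GX-style property table (its glyph indices and values are invented), while the character-based lookup uses Python's standard unicodedata module. Neither function performs the reordering itself.

```python
import unicodedata

# Hypothetical per-glyph property table: glyph index -> direction class.
# A GX-style lookup works even for glyphs with no Unicode equivalent.
TOY_GLYPH_DIRECTION = {41: "L", 90: "R", 300: "R"}


def direction_from_glyph(glyph_index):
    """Glyph-based lookup; unknown glyphs are treated as neutral ("ON")."""
    return TOY_GLYPH_DIRECTION.get(glyph_index, "ON")


def direction_from_character(ch):
    """Character-based lookup, driven by the Unicode character database."""
    return unicodedata.bidirectional(ch)
```

In this sketch, glyph 300 can carry a right-to-left property even if no character code maps to it, which is the case the character-based approach cannot cover.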
Glyph identity changes
Changing the initial set of glyphs, which came out of the character
to glyph mapping step, into a different set of glyphs is handled in this
step. An example is the formation of an 'fi' ligature from separate 'f'
and 'i' glyphs. Another example is the contextual letterform changes that
happen in a language like Arabic.
Support for these changes is in the metamorphosis
table in GX, and in the 'GSUB' table in OpenType. Both systems allow the
selection of a set of features, and both allow dispatch via language (though
the GX approach here is somewhat clumsy; see the OpenType
advantages section). There are five different kinds of glyph identity
change that a line layout engine should support:
Noncontextual
A noncontextual change is one where no context is needed, such as a
simple swash variant. These changes are chosen by the user via one or more features,
or by larger-scope contexts such as the vertical orientation of the run
or line of text. Both GX and OpenType provide for noncontextual substitution,
including sensitivity to vertical orientation. A minor difference is that
OpenType defines ligature decomposition as happening here in the 'GSUB'
table, whereas GX puts ligature decomposition as happening later, when
justification occurs (and puts the ligature decomposition data into the
'just' table and not the 'mort' table).
Contextual
Both GX and OpenType permit the identification of certain contexts in
which glyph identity changes are to happen. There is a major difference,
however, in the way in which these are specified.
In GX, context is identified via a modified finite state machine. The
runs of text are processed, one glyph at a time, and as particular glyphs
are encountered, the state machine can enter different states, causing
different actions to occur. Thus, the same letter can have different changes
made to it if it is in different contexts, for instance the start of the line
or in a particular position in a word. Note that the GX state engine differs
from the classical definition of a state machine in that it permits side
effects (i.e. changes to the input data).
In OpenType, context is identified via string matching. OpenType has
three contextual substitution formats:
- Matching a particular sequence of glyphs, like "abc". Any
string so matched can be replaced by another string.
- Similar to format 1, but instead of sequences of specific glyphs, this
allows for sequences of classes, where each class has one or more glyphs
as members. Glyphs are only permitted to belong to one class.
- Similar to format 2, but glyphs are permitted to belong to more than
one class.
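The first two formats can be sketched roughly as follows. This is an illustration of string and class matching in general, not of the binary 'GSUB' lookup structures; the function names, glyph names, and class table are all assumptions made for the example.

```python
def substitute_sequence(glyphs, pattern, replacement):
    """Format-1 style: replace every exact occurrence of `pattern`."""
    if not pattern:
        raise ValueError("pattern must contain at least one glyph")
    out, i = [], 0
    while i < len(glyphs):
        if glyphs[i:i + len(pattern)] == pattern:
            out.extend(replacement)
            i += len(pattern)
        else:
            out.append(glyphs[i])
            i += 1
    return out


def matches_classes(glyphs, start, class_pattern, class_of):
    """Format-2 style: compare glyph classes instead of glyph identities."""
    window = glyphs[start:start + len(class_pattern)]
    return len(window) == len(class_pattern) and all(
        class_of.get(glyph) == wanted
        for glyph, wanted in zip(window, class_pattern)
    )


# Hypothetical class table: each glyph belongs to exactly one class.
TOY_CLASSES = {"a": "vowel", "e": "vowel", "b": "consonant", "c": "consonant"}
replaced = substitute_sequence(list("xabcy"), list("abc"), ["a_b_c"])
```

Note that both functions need an actual run of glyphs to match against, and that nothing is remembered from one match to the next.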
In general, any of the three OpenType contextual formats can be converted
into equivalent GX state machines. However, the converse is not true: there
exist GX state machines which cannot be expressed as string matches. Mathematically,
this property derives from the fact that modifications are permitted by
GX's state machine, and so simple string matches can't be used.
For example, suppose a letter is to change into a particular alternate
form only if that form hasn't appeared more than twice previously on the
same line. Since this change might have to look all the way back to the
beginning of the line (even across runs in other fonts), there is no easy
way to express this in OpenType, though it's easy in GX. Other examples
are easy to come up with: any effect which relies on counting or remote
context earlier in the line is generally only expressible via state machines.
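The counting example can be sketched as a small state machine with side effects. This is only an illustration of the idea: it does not use the actual GX state table format, and the function and glyph names are hypothetical.

```python
def apply_limited_alternate(glyphs, base, alternate, limit=2):
    """Swap `base` for `alternate` unless the alternate has already
    appeared more than `limit` times earlier on the line."""
    # The state is the number of alternates seen so far, capped at
    # limit + 1, so the machine has a finite number of states.
    state = 0
    out = []
    for glyph in glyphs:
        if glyph == base and state <= limit:
            glyph = alternate  # side effect: the glyph stream is modified
        if glyph == alternate and state <= limit:
            state += 1
        out.append(glyph)
    return out


line = apply_limited_alternate(["a", "b", "a", "a", "a", "a"], "a", "a.alt")
```

The decision for each glyph depends on how far the machine has advanced, however many unrelated glyphs lie in between, which is what a fixed-length string match cannot capture.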
Ligature
Both GX and OpenType permit many-to-one mappings for creating ligatures.
As with the contextual actions, the major difference between the two formats
is that GX uses finite state machines to identify how and when the ligatures
form, while OpenType uses string matching. If you wish to create complex
Devanagari ligatures, for instance, which only form in certain contexts,
then GX makes this much easier than OpenType.
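A many-to-one mapping of the string-matching kind can be sketched as a longest-match lookup. The ligature table and glyph names here are invented, and the sketch deliberately ignores context, which is exactly the part that gets difficult for complex scripts.

```python
# Hypothetical ligature table: component glyph sequence -> ligature glyph.
TOY_LIGATURES = {
    ("f", "f", "i"): "f_f_i",
    ("f", "i"): "f_i",
    ("f", "l"): "f_l",
}


def form_ligatures(glyphs, ligatures=TOY_LIGATURES):
    """Replace component sequences with ligatures, longest match first."""
    longest = max(len(key) for key in ligatures)
    out, i = [], 0
    while i < len(glyphs):
        for n in range(longest, 1, -1):
            key = tuple(glyphs[i:i + n])
            if len(key) == n and key in ligatures:
                out.append(ligatures[key])
                i += n
                break
        else:
            # No ligature starts here; keep the glyph unchanged.
            out.append(glyphs[i])
            i += 1
    return out


word = form_ligatures(list("office"))
```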
Insertion
GX provides for the insertion of arbitrary sequences of glyphs in locations
determined by a finite state machine. OpenType does not appear to offer
this capability. Any insertions that OpenType provides are handled by the
contextual mechanism listed above. The shortcomings of this scheme are
easily identified: if you wish to insert glyphs where there are none, then
there is no "matching string" available, and the OpenType method
breaks down. Similarly, if you wish to insert glyphs based on arbitrary
context later in the line, only GX lets you do this. An example of where
this capability is important is in writing systems where the vowels are
"split" into two separate places on the line. Another example
is automatic conversion of numbers into Roman numeral form.
Rearrangement
For historical reasons, rearrangement (a feature of South Asian scripts)
was added as one of the glyph identity changes to GX, even though it might
seem to be more related to glyph positioning. The reasons for this were
that rearrangement takes advantage of finite state machines, and thus it
was easier to drive rearrangement from the same engine that was driving
the other state machine changes.
There is not an explicit rearrangement action in OpenType. Some of this
functionality is subsumed in the contextual table, but arbitrary length
matches are difficult in OpenType, and easier in GX. An example of where
this is important is in the Devanagari rules for how the shape of the chota-i
changes when a "r" is present. We do this in our shipping GX
Indic fonts, but I don't easily see how it would be accomplished in OpenType.
Simple Positioning
Once the glyph identities have been changed, it is time to look at changing
the positions of the glyphs with respect to one another. These positioning
changes are usually expressed as deltas from the "natural" advances
of the glyphs. Further, unlike the plaintext assumption of only modifying
the glyphs' x-positions, a sophisticated positioning model allows control
over both x-position and y-position. This is required for scripts like
Urdu, where the delta-Y depends on the position of the letter in the word;
or in Tibetan, where vowels can stack on top of one another. Each of the
kinds of simple positioning is compared in the following sections.
Baseline Alignment
Both GX and OpenType provide for the automatic alignment to a dominant
baseline, via GX's 'bsln'
table and OpenType's 'BASE' table. This alignment can be to the hinted
position of a control point, or expressed in the font's own coordinate
system in both formats.
Kerning
The kerning table was defined in the original TrueType specification.
The GX version of the kerning
table has undergone several improvements since the original TrueType
release. While Microsoft supported the original version of the kerning
table, with OpenType they are now using the 'GPOS' table to specify kerning.
Presumably, a mechanism exists to convert old 'kern' data into the new
format for OpenType.
GX supports four kinds of kerning subtable:
- Simple pairwise kerning, which is closest to the old notion of "kerning
pairs";
- Class-based kerning, where glyphs which behave similarly can be grouped
together to save space in the kerning table;
- State-based kerning, where a full finite state machine can be used
to determine the kerning amount to be applied in a given context; and
- Compressed class-based, which is a version of the class-based kerning
table compressed to take even less space.
OpenType supports pairwise and class-based kerning. It also supports
a kind of contextual kerning, with the same limitations as those described
above in the discussion of contextual glyph substitution:
namely, the context must be defined in terms of string or string/class
matching.
Both formats support independent horizontal and vertical kerning.
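The difference between pairwise and class-based kerning can be sketched as two kinds of lookup. All glyph names, class numbers, and kerning values below are invented, and neither format's binary layout is represented.

```python
# Hypothetical pairwise data: (left glyph, right glyph) -> adjustment
# in font units.
TOY_PAIR_KERNS = {("A", "V"): -80, ("T", "o"): -60}

# Hypothetical class-based data: glyphs that behave alike share a class,
# so a single table cell covers many pairs. Class 0 means "everything else".
TOY_LEFT_CLASS = {"A": 1, "Aacute": 1, "Agrave": 1}
TOY_RIGHT_CLASS = {"V": 1, "W": 1}
TOY_CLASS_KERNS = [
    [0, 0],    # left class 0
    [0, -80],  # left class 1
]


def pair_kern(left, right):
    """Pairwise lookup: one entry per explicit glyph pair."""
    return TOY_PAIR_KERNS.get((left, right), 0)


def class_kern(left, right):
    """Class-based lookup: index a small matrix by the two glyph classes."""
    row = TOY_LEFT_CLASS.get(left, 0)
    col = TOY_RIGHT_CLASS.get(right, 0)
    return TOY_CLASS_KERNS[row][col]
```

In this toy data, six glyph pairs are covered by a single matrix cell, which is where the space saving of the class-based form comes from.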
Kerning is also used in another way in GX, and this is an area where
OpenType has a better answer: namely, accent attachment. The GX assumption
is that the font developer would provide the accented glyphs directly in
the font (there's even an accent
attachment table to make this as compact as possible). While you can
do contextual positioning using the GX kerning table in both X and Y, the
positioning is done only in distances, and not anchored to control points.
At small pixelsPerEm values, this can cause noticeable mispositionings.
OpenType's 'GPOS' table, on the other hand, allows full access to hinted
attachment points for the dynamic construction of accented forms.
Tracking
Unlike kerning, tracking affects letterspacing uniformly, without regard
to context. GX provides an explicit tracking
table, while OpenType subsumes this functionality into the 'GPOS' table.
The GX tracking deltas are functions of two variables: pointsize and specified
track (e.g. "tight" or "loose"), with two-dimensional
interpolation happening around tabular data when needed. There does not
appear to be a way to specify a track in OpenType.
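The two-dimensional interpolation can be sketched as follows. The table values are invented, and clamping to the ends of the table is an assumption of this sketch, not a statement about what GX does outside the tabulated range.

```python
from bisect import bisect_right

# Hypothetical tracking data: one row of deltas per track value,
# one column per point size. All numbers are invented.
TOY_SIZES = [9.0, 12.0, 24.0]
TOY_TRACKS = [-1.0, 0.0, 1.0]  # tight, normal, loose
TOY_DELTAS = [
    [-10.0, -6.0, -2.0],  # track -1.0
    [0.0, 0.0, 0.0],      # track  0.0
    [12.0, 8.0, 4.0],     # track +1.0
]


def _bracket(values, x):
    """Return (lower index, fraction) for x, clamped to the table range."""
    if x <= values[0]:
        return 0, 0.0
    if x >= values[-1]:
        return len(values) - 2, 1.0
    hi = bisect_right(values, x)
    lo = hi - 1
    return lo, (x - values[lo]) / (values[hi] - values[lo])


def tracking_delta(point_size, track):
    """Interpolate linearly across sizes, then across track values."""
    ti, tf = _bracket(TOY_TRACKS, track)
    si, sf = _bracket(TOY_SIZES, point_size)

    def along_sizes(row):
        return row[si] + sf * (row[si + 1] - row[si])

    low = along_sizes(TOY_DELTAS[ti])
    high = along_sizes(TOY_DELTAS[ti + 1])
    return low + tf * (high - low)
```

The resulting delta is applied uniformly to every glyph in the run, regardless of its neighbors.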
Manual Letterspacing
This feature does not refer to any table, in either GX or OpenType.
In GX, manual letterspacing is specified in the options presented when
a line is requested to be laid out. In OpenType, the application is responsible
for all the processing anyway, so adding manual letterspacing adjustments
happens as part of that process.
Optical Alignment
Optical alignment is the fine-tuning of the positions of letters at
the margins. GX provides an explicit optical
alignment table, while OpenType provides for this function via the
'GPOS' table. In both cases, there is support for using hinted anchorpoints
or using XY distances.
Hanging Punctuation
A punctuation mark is said to "hang" when it appears completely
outside the margins. In GX, the glyph
properties table includes an indication of whether a particular glyph
is permitted to hang on the left and/or right sides (or the top and/or
bottom for vertical text). The only OpenType support I could find for this
is essentially the same support it has for optical alignment: data in the
'GPOS' table.
Ligature Carets
When the user selects text which includes ligatures, the software must
either prohibit the ligature from being subdivided, or it must permit selection
within the single ligature glyph. Both GX and OpenType provide for this
behavior, called ligature carets. GX defines the ligature
caret table, while OpenType includes this information in the 'GDEF'
or glyph definition table. Both formats permit this information to be specified
in either XY distances or as hinted control points.
Justification
Once the general positioning has happened, it is time to justify the
line (if desired). In GX this process is heavily controlled by the justification
table in the font; in OpenType, it is controlled by the 'JSTF' table.
The discussion in this section presumes that the reader is familiar with
the GX justification algorithm.
Fundamental to the GX model is the notion of applying factors in a fixed
priority, from highest (kashidas) to lowest (intercharacter). OpenType
has the notion of priorities as well, but instead of controlling the factors,
the priorities in OpenType control "justification suggestions"
which can include other kinds of action, like decomposing ligatures.
The difference here is one of model. The GX model determines factors,
applies them by priority to determine the gaps that need to be filled,
and then applies a separate pass (the so-called "postcompensation"
pass) to fill those gaps in a linguistically correct manner. OpenType fuses
these two separate passes into a single pass, where the highest priority
actions take place first, both in terms of factors and in terms of postcompensation-like
actions.
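The idea of applying factors in priority order can be sketched in a much simplified form. This is not the actual GX algorithm: the slot representation, the proportional sharing within a priority level, and all numbers are assumptions made for illustration, and the postcompensation pass is not modeled at all.

```python
def distribute_gap(gap, slots):
    """Distribute `gap` over stretchable slots, highest priority first.

    `slots` is a list of (priority, grow_limit) pairs, where priority 0
    is applied first. Returns the growth assigned to each slot.
    """
    growth = [0.0] * len(slots)
    remaining = gap
    for priority in sorted({p for p, _ in slots}):
        members = [i for i, (p, _) in enumerate(slots) if p == priority]
        capacity = sum(slots[i][1] for i in members)
        if capacity <= 0:
            continue
        # Use this priority level fully before moving to the next one.
        share = min(1.0, remaining / capacity)
        for i in members:
            growth[i] = slots[i][1] * share
        remaining -= capacity * share
        if remaining <= 0:
            break
    return growth


# Two high-priority slots (limit 4.0) and two low-priority ones (limit 1.0).
toy_slots = [(0, 4.0), (1, 1.0), (0, 4.0), (1, 1.0)]
small_gap = distribute_gap(6.0, toy_slots)
large_gap = distribute_gap(9.0, toy_slots)
```

With the smaller gap only the high-priority slots grow; the lower priority is touched only once the higher one has reached its limits.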
OpenType links the justification table back to the glyph positioning
table, and shares common lookup information, but contextual lookups are
not permitted in the justification table per se (the assumption being that
the context is determined by the 'GPOS' table, and the 'JSTF' table provides
extra information). GX, in contrast, allows a separate finite state machine
just for justification, separate from the ones for kerning and glyph identity
changes. As with the contextual changes, the kinds of context that can
be handled by a finite state machine are somewhat broader than those that
are handled by generic string matching.
Most of the types of postcompensation action supported by GX appear
to be supported by OpenType, including glyph substitution, kashida insertion,
and ligature decomposition. OpenType does not appear to have support for
glyph stretching or ductility, both of which GX supports. Glyph stretching,
if specified, causes a glyph to have a scale factor applied to it. Ductility
uses the font's variations mechanism (Multiple Masters for Type 1, or TrueType
variations for TrueType) to actually change the shape of the letter as
needed to fill the specified gap. This allows GX fonts to perform automatic
copyfitting, for example.
Device Positioning
Device positioning is one area where OpenType is clearly ahead of GX.
OpenType provides explicit support for "Device Tables" which
are referred to by the other OpenType layout tables. These device tables
include corrections for specific pixel-per-em values. Thus, a kerning adjustment
can be tweaked to an integral number of pixels.
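A device correction of this kind can be sketched as a per-size lookup added to the scaled value. The table contents are invented, and the choice to round the scaled value before adding the correction is an assumption of this sketch, not something taken from the OpenType specification.

```python
# Hypothetical device table: pixels-per-em -> correction in whole pixels.
TOY_DEVICE_TABLE = {9: 1, 10: 0, 11: -1}


def kern_in_pixels(kern_units, units_per_em, ppem, device_table=TOY_DEVICE_TABLE):
    """Scale a kerning value to pixels, then apply the per-size correction."""
    scaled = kern_units * ppem / units_per_em
    # Sizes without an entry in the device table get no correction.
    return round(scaled) + device_table.get(ppem, 0)
```

For example, a kern of -80 units in a 1000-unit em scales to -0.88 pixels at 11 pixels per em; the sketch rounds that to -1 and the table entry for that size adjusts it to -2.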
The GX model, by contrast, treats most layout tables as living in ideal,
unhinted space; the only exceptions are the optical bounds, baseline, and
ligature caret tables, which can use hinted control points. This forces a different
layout model: layout happens in ideal space, and only after the whole line
is laid out are device tweaks accomplished. While this model works well
in some cases, there are other cases where doing the layout with device
metrics from the outset would probably produce better results.
Where GX has the advantage
The major advantage of GX is in the use of finite state machines to
match context, in all three steps of the layout process: glyph identity,
glyph positioning, and justification. While OpenType provides some contextual
capability, the discussions above point out ways
in which state machines are simply more powerful.
GX does not presume anything about the character codes it is given,
other than that they will be mapped into glyph codes for all the layout
processing. GX is very flexible about living with many simultaneous encodings
for the text it works with. By contrast, in many cases OpenType is hobbled
by its reliance on Unicode, and the presumption that the application will
do some processing (e.g. bidirectional reordering) before getting OpenType
layout involved.
Where OpenType has the advantage
There are some areas where OpenType's approach is more flexible than
GX's:
- Explicit support for language. In GX, you currently have to duplicate
glyphs and add extra 'cmap' subtables for the languages you want to use
to distinguish processing. OpenType identifies language as a fundamental
selector, used in all the OpenType tables.
- Support for control-point based attachment. The GX assumption has been
that the font designer would do this by creating actual glyphs with correctly
hinted components. However, it is also useful to imagine control-point
arbitrated kerning, or accent attachment.
The system support question
While not directly related to the font format per se, one of the principal
advantages of the GX approach is that a developer does not need to know
any of the details described in the previous sections. By making layout
a built-in part of the system, application developers are freed from the
tedium of having to know the formats of the various tables and processing
them. This advantage means that an application developer can write a single
version of the code which will work correctly with any font in any language,
since the details that usually complicate the text rendering process (such
as bidirectional flow, vertical vs. horizontal, complex hit-testing and
highlighting, etc.) are all handled automatically.
Contrast this with the signal weakness of OpenType: it forces the applications
to do all the work of processing its tables and doing the linguistic processing.
This leads to the situation where an application developer who doesn't
have linguistic expertise may not be able to ship an application in certain
worldwide markets, simply because there's no time to do all the detail
work. An engineer at Sun who works with Tibetan made the point this way:
[OpenType's] publicity has eclipsed its functionality. As far as complex
glyph substitution is concerned, it just doesn't cut it. It claims to support
it, but the reality is that it is a pathologically underspecified non-standard.
It provides a slot for data, that's it. There is no standard for the form
of the data, and more significantly, there is no requirement for a system's
font rendering machinery to provide transparent-to-the-application functionality
based on the data. If you want to use that data, an application developer
must collaborate with a font developer and add code to the application
to do the transformations required for that language.
Can I please see a show of hands of developers willing to spend a few
weeks to add support for Tibetan to their application? Oh, and a few more
weeks each for a few dozen other small languages? For free?
It ain't going to happen[...] The proper place for this type of software
is in the system software, where any application developer writing to a
generic API can have his application literally "run anywhere."
Apple has implemented this functionality splendidly with TrueType GX /WorldScript/
GX Line layout. Although it's not easy for a font developer to create the
required state tables, at least it's possible, and Apple has made tools
available to help.
Type technology evolves over time. New capabilities get added, which
means new table formats get defined. This is another way in which the GX
developer has a great advantage over an OpenType developer: no revision
of the application is required. The new capabilities just work. (This was
demonstrated when insertion actions were added in GX 1.1; the shipping
GX applications took advantage of the new capabilities without having to
do anything). In OpenType, however, when the font format changes, all applications
have to be revised, unless they are willing to not provide the new functionality.
Of course, if at some point system support is added for OpenType, this
argument will become less relevant.
Document History
- 8 May 1997: Created the first version
Copyright © 1998 by Apple Computer, Inc.
Updated 2/4/98