
Important: The information in this document is obsolete and should not be used for new development.

Inside Macintosh: Text /
Appendix B - International Resources / String-Manipulation Resource (Type 'itl2')


The 'itl2' Tables

The following tables in the string-manipulation resource define character and word features for processing strings.

Script Run Table Format

The script run table is used by the Text Utilities FindScriptRun function. FindScriptRun locates runs of text that belong to a subscript, such as Roman, within a single script run. The FindScriptRun function is described in the chapter "Text Utilities" in this book.

There are two formats of script run table. The original format, used in versions of system software earlier than 7.1, consists of a series of byte pairs of the form (character code, script code). The character code is the final character code in a range of characters that belongs to the script specified by the script code. (The table contains only final character codes; the initial character code of a range is assumed to be one greater than the final character code in the previous range--or 0 for the first range.) The last pair must have character code $FF. For example, if the character set encoding for script smSample were defined such that $00-$7F and $A0 were Roman characters and the remaining characters were native characters in smSample, the table would appear as follows:
Character code    Script code
$7F               smRoman
$9F               smSample
$A0               smRoman
$FF               smSample

This simple format is appropriate for script systems whose text can be separated into Roman or native characters based purely upon character code, and for which other subscript information (returned in the variant field of the ScriptRunStatus record by FindScriptRun) is always 0. For 2-byte script systems, or when the same character could be designated as either Roman or native (depending on its context), this simple format is insufficient.
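
The old-format lookup described above can be sketched as follows. This is a minimal illustration, not the system's implementation: the function names are invented for the example, and the numeric value given to smSample is a placeholder, since smSample is itself only an example script.

```python
SM_ROMAN = 0      # smRoman is script code 0 in the Script Manager
SM_SAMPLE = 100   # placeholder value (assumption) for the example script smSample

# Old-format table: byte pairs of (final character code, script code).
OLD_TABLE = bytes([
    0x7F, SM_ROMAN,
    0x9F, SM_SAMPLE,
    0xA0, SM_ROMAN,
    0xFF, SM_SAMPLE,
])

def script_for_char(table, char_code):
    """Return the script code of the first range whose final code is >= char_code."""
    for i in range(0, len(table), 2):
        final_code, script = table[i], table[i + 1]
        if char_code <= final_code:
            return script
    raise ValueError("table must end with character code $FF")

def split_runs(table, text):
    """Group consecutive 1-byte characters that map to the same script code."""
    runs = []
    for byte in text:
        script = script_for_char(table, byte)
        if runs and runs[-1][0] == script:
            runs[-1] = (script, runs[-1][1] + bytes([byte]))
        else:
            runs.append((script, bytes([byte])))
    return runs
```

Because each range is identified only by its final character code, a single forward scan is enough, and the required final $FF entry guarantees that every 1-byte character code matches some range.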

The newer format for the script run table is used in versions of system software starting with 7.1. It consists of a header, a state table, and a set of associated tables, similar in structure to the word-break table of type NBreakTable (described on page B-44). It is more flexible than the old format: for example, it can consider punctuation marks such as the period (ASCII code $2E) to be either Roman or non-Roman, depending upon whether they are associated with Roman or non-Roman characters in the text. The header of the new-format script run table is shown in Figure B-5.

Figure B-5 Format of the script run table header (new format)

The table header has these elements:

The header is immediately followed by the data of the class table, auxiliary class table, state table, and return table. The tables have this format and content:

The state table is shown in Figure B-6. The table begins with a list of words containing byte offsets from the beginning of the state table to the rows of the state table; this is followed by a c-by-s byte array, where c is the number of classes (columns) and s is the number of states (rows). The bytes in this array are stored with the column index varying most rapidly--that is, the bytes for the state 0 row precede the bytes for the state 1 row. There is a maximum of 128 classes and 64 states (including the start and exit states).
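
The layout just described can be sketched as follows. The function names and the sample contents are invented for illustration. The sketch assumes that each offset word is an unsigned 16-bit big-endian value and that the byte array immediately follows the offset list; the meaning of the individual action-code bytes is not modeled.

```python
import struct

MAX_CLASSES = 128
MAX_STATES = 64

def build_state_table(rows):
    """Pack equal-length rows of action-code bytes into the described layout."""
    num_states = len(rows)
    num_classes = len(rows[0])
    assert num_states <= MAX_STATES and num_classes <= MAX_CLASSES
    assert all(len(row) == num_classes for row in rows)
    header_size = 2 * num_states  # one 16-bit offset word per row
    # Each offset is measured from the beginning of the state table.
    offsets = [header_size + s * num_classes for s in range(num_states)]
    header = struct.pack(">%dH" % num_states, *offsets)
    body = b"".join(bytes(row) for row in rows)  # state 0 row first
    return header + body

def action_code(state_table, state, char_class):
    """Look up the action-code byte for a (state, class) pair."""
    (row_offset,) = struct.unpack_from(">H", state_table, 2 * state)
    return state_table[row_offset + char_class]
```

Going through the offset word rather than computing state times c directly means the lookup does not need to know the number of classes, which is presumably why the offset list is stored.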

Figure B-6 Script run table state table

Each entry in this array is an action code, which specifies

The format of an action code is shown in Figure B-7.

Figure B-7 Format of a script run table action code

The return table is a list of script code-variant pairs, as shown in Figure B-8. The table lists possible return values for the FindScriptRun function. Each pair in the table is a ScriptRunStatus record, as described in the chapter "Text Utilities" in this book. The variant associated with each script code gives subscript information for 2-byte script systems. When FindScriptRun exits the state table, it has encountered a subscript boundary; it uses the exit code to index into the return table and determine the script code of the subscript run it has just exited from.
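
As a rough sketch of the return-table lookup, the following assumes that each ScriptRunStatus entry occupies two signed bytes (script code, then variant) and that the exit code is a zero-based index into the table. Both are assumptions made for illustration, as are the function name and the sample contents.

```python
import struct

def return_entry(return_table, exit_code):
    """Return the (script code, variant) pair selected by an exit code."""
    # Assumption: entries are packed as two signed bytes each.
    script, variant = struct.unpack_from(">bb", return_table, 2 * exit_code)
    return script, variant

# Hypothetical return table with two entries.
SAMPLE_RETURN_TABLE = bytes([
    0, 0,   # entry 0: script code 0, variant 0
    1, 2,   # entry 1: script code 1, variant 2
])
```

With this sample table, an exit code of 1 selects the second pair, so the run just exited would be reported with script code 1 and variant 2.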

Figure B-8 Format of the script run table return table



© Apple Computer, Inc.
6 JUL 1996