Important: The information in this document is obsolete and should not be used for new development.

Inside Macintosh: Text /
Chapter 6 - Script Manager / Script Manager Reference
Routines / Analyzing Characters


CharacterType

The CharacterType function returns a variety of information about the character represented by a given byte, including its type, class, orientation, direction, case, and size (in bytes).

FUNCTION CharacterType (textBuf: Ptr; textOffset: Integer; 
                        script: ScriptCode): Integer;
textBuf
A pointer to a text buffer containing the character to be examined.
textOffset
The byte offset, measured from textBuf, of the character to be examined. It can point to either the first or the second byte of a 2-byte character. The first byte of the first character has an offset of 0.
script
A value that specifies the script system the byte belongs to. Constants for all defined script codes are listed on page 6-52. To specify the font script, pass smCurrentScript in this parameter.
DESCRIPTION
The CharacterType return value is an integer bit field that provides information about the requested character. The field has the following format:
Bit range  Name         Explanation
0-3        Type         Character types
4-7                     (reserved)
8-11       Class        Character classes (= subtypes)
12         Orientation  Horizontal or vertical
13         Direction    Left or right[6]
14         Case         Uppercase or lowercase
15         Size         1-byte or 2-byte

The Script Manager defines the recognized character types, character classes, and character modifiers (bits 12-15), with constants to describe them. All of the constants are listed and described in the section "Getting Character-Type Information" beginning on page 6-28.

The Script Manager also defines a set of masks with which you can isolate each of the fields in the CharacterType return value. If you perform an AND operation with the CharacterType result and the mask for a particular field, you select only the bits in that field. Once you've done that, you can test the result, using the constants that represent the possible results.

The CharacterType field masks are the following:
Mask                Hex. value  Explanation
smcTypeMask         $000F       Character-type mask
smcReserved         $00F0       (reserved)
smcClassMask        $0F00       Character-class mask
smcOrientationMask  $1000       Character orientation (2-byte scripts)
smcRightMask        $2000       Writing direction (bidirectional scripts);
                                main character set or subset (2-byte scripts)
smcUpperMask        $4000       Uppercase or lowercase
smcDoubleMask       $8000       Size (1 or 2 bytes)
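
As a minimal sketch of the masking step, the C fragment below defines the masks with the hex values from the mask table and isolates one field with a bitwise AND. It does not call the Toolbox. The helper name ct_field is hypothetical, and the value passed to it merely stands in for a CharacterType result.

```c
#include <stdint.h>

/* Field masks, using the hex values listed in the mask table. */
enum {
    smcTypeMask        = 0x000F,  /* bits 0-3: character type  */
    smcClassMask       = 0x0F00,  /* bits 8-11: character class */
    smcOrientationMask = 0x1000,  /* bit 12 */
    smcRightMask       = 0x2000,  /* bit 13 */
    smcUpperMask       = 0x4000,  /* bit 14 */
    smcDoubleMask      = 0x8000   /* bit 15 */
};

/* Hypothetical helper: keep only the bits selected by one mask.
   The result is treated as unsigned first, because the Pascal
   Integer that CharacterType returns is a signed 16-bit value. */
static uint16_t ct_field(int16_t result, uint16_t mask)
{
    return (uint16_t)((uint16_t)result & mask);
}
```

For example, ct_field(result, smcClassMask) leaves only bits 8-11 set, so the value can be compared directly against a character-class constant without shifting.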

The character type of the character in question is the result of performing an AND operation with smcTypeMask and the CharacterType result. Constants for the defined character types are listed on page 6-28.

The character class of the character in question is the result of performing an AND operation with smcClassMask and the CharacterType result. Character classes can be considered as subtypes of character types. Constants for the defined character classes are listed on page 6-29.

The orientation of the character in question is the result of performing an AND operation with smcOrientationMask and the CharacterType result. The orientation value can be either smCharHorizontal or smCharVertical.

The direction of the character in question is the result of performing an AND operation with smcRightMask and the CharacterType result. The direction value can be either smCharLeft (left-to-right) or smCharRight (right-to-left).

The case of the character in question is the result of performing an AND operation with smcUpperMask and the CharacterType result. The case value can be either smCharLower or smCharUpper.

The size of the character in question is the result of performing an AND operation with smcDoubleMask and the CharacterType result. The size value can be either smChar1byte or smChar2byte.
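
The four modifier tests described above can be sketched in C as follows. This is an illustration under stated assumptions, not Toolbox code: this section names the modifier constants but does not list their values, so the values below are an assumption (each pair is taken to be zero and the corresponding mask bit). The CharModifiers record and decode_modifiers function are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Modifier masks, using the hex values from the mask table. */
enum {
    smcOrientationMask = 0x1000,
    smcRightMask       = 0x2000,
    smcUpperMask       = 0x4000,
    smcDoubleMask      = 0x8000
};

/* ASSUMPTION: constant values are not given in this section; each
   pair is assumed to be zero and the bit selected by its mask. */
enum {
    smCharHorizontal = 0x0000, smCharVertical = 0x1000,
    smCharLeft       = 0x0000, smCharRight    = 0x2000,
    smCharLower      = 0x0000, smCharUpper    = 0x4000,
    smChar1byte      = 0x0000, smChar2byte    = 0x8000
};

/* Hypothetical record holding the decoded modifier bits. */
typedef struct {
    bool vertical;     /* bit 12 */
    bool rightToLeft;  /* bit 13; in 2-byte scripts this bit instead
                          describes main character set membership */
    bool upper;        /* bit 14 */
    bool twoByte;      /* bit 15 */
} CharModifiers;

/* AND the result with each mask, then compare with the constant. */
static CharModifiers decode_modifiers(int16_t result)
{
    uint16_t r = (uint16_t)result;  /* avoid sign issues with bit 15 */
    CharModifiers m;
    m.vertical    = (r & smcOrientationMask) == smCharVertical;
    m.rightToLeft = (r & smcRightMask)       == smCharRight;
    m.upper       = (r & smcUpperMask)       == smCharUpper;
    m.twoByte     = (r & smcDoubleMask)      == smChar2byte;
    return m;
}
```

Comparing the masked value with a named constant, rather than testing for nonzero, mirrors the procedure in the text: apply the mask, then test the result against the constants for that field.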

Note
CharacterType calls CharacterByteType to determine whether the byte at textOffset is a 1-byte character or the first byte or second byte of a 2-byte character. The larger the text buffer, the longer CharacterByteType takes to execute. To be most efficient, place the pointer textBuf at the beginning of the character of interest before calling CharacterType. (If you want to be compatible with older versions of CharacterType, also set textOffset to 1, rather than 0, for 2-byte characters.)
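
The recommendation in this note can be sketched in C as a small helper that prepares the two arguments before the call. It is only an illustration: CharTypeArgs and args_for_char are hypothetical names, the caller is assumed to know already where the character starts and whether it occupies 2 bytes, and the sketch does not call CharacterType itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pair of arguments for a later CharacterType call. */
typedef struct {
    const char *textBuf;     /* pointer to pass as textBuf    */
    short       textOffset;  /* offset to pass as textOffset  */
} CharTypeArgs;

/* buf:          start of the whole text buffer
   charStart:    byte index of the first byte of the character
   isTwoByte:    the caller knows the character occupies 2 bytes
   legacyCompat: also support older versions of CharacterType  */
static CharTypeArgs args_for_char(const char *buf, size_t charStart,
                                  bool isTwoByte, bool legacyCompat)
{
    CharTypeArgs a;
    /* Point at the character itself so that little or no text
       precedes the offset that has to be examined. */
    a.textBuf = buf + charStart;
    /* Older versions expect the second byte of a 2-byte character. */
    a.textOffset = (isTwoByte && legacyCompat) ? 1 : 0;
    return a;
}
```

The returned pointer and offset would then be passed as textBuf and textOffset, together with the script code.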
SPECIAL CONSIDERATIONS
CharacterType may move memory; your application should not call this function at interrupt time.

If you specify smCurrentScript for the script parameter, CharacterType always assumes that the text in the buffer belongs to the font script. It is unaffected by the state of the font force flag or the international resources selection flag.

For 1-byte script systems, the character-type tables are in the string-manipulation ('itl2') resource. For 2-byte script systems, they are in the encoding/rendering ('itl5') resource. If the appropriate resource does not include these tables, CharacterType exits without doing anything.

Some Roman fonts (for example, Symbol) substitute other characters for the standard characters in the Standard Roman character set. Since the Roman script system CharacterType function assumes the Standard Roman character set, it may return inappropriate results for nonstandard characters.

In versions of system software earlier than 7.0, the textOffset parameter to the CharacterType function must point to the second byte of a 2-byte character.

RESULT CODES
The complete set of CharacterType return values is found in the section "Getting Character-Type Information" beginning on page 6-28.


[6] In 2-byte script systems, bit 13 indicates whether or not the character is part of the main character set (not a user-defined character).

© Apple Computer, Inc.
6 JUL 1996