Important: The information in this document is obsolete and should not be used for new development.
CharacterType
The CharacterType function returns a variety of information about the character represented by a given byte, including its type, class, orientation, direction, case, and size (in bytes).
FUNCTION CharacterType (textBuf: Ptr; textOffset: Integer; script: ScriptCode): Integer;
textBuf
- A pointer to a text buffer containing the character to be examined.
textOffset
- The offset to the location of the character to be examined. (It can be an offset to either the first or the second byte of a 2-byte character.) The offset is in bytes; the first byte of the first character has an offset of 0.
script
- A value that specifies the script system the byte belongs to. Constants for all defined script codes are listed on page 6-52. To specify the font script, pass smCurrentScript in this parameter.
DESCRIPTION
The CharacterType return value is an integer bit field that provides information about the requested character. The field has the following format:
Bit range   Name          Explanation
0-3         Type          Character types
4-7         (reserved)
8-11        Class         Character classes (= subtypes)
12          Orientation   Horizontal or vertical
13          Direction     Left or right[6]
14          Case          Uppercase or lowercase
15          Size          1-byte or 2-byte
The Script Manager defines the recognized character types, character classes, and character modifiers (bits 12-15), with constants to describe them. All of the constants are listed and described in the section "Getting Character-Type Information" beginning on page 6-28.
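To make the layout concrete, the sketch below builds a mask for a field directly from its bit range and extracts the field as a small number. It is a standalone C illustration, not Script Manager code: field_mask and field_value are hypothetical helpers introduced here, and bit 0 is assumed to be the least significant bit of the 16-bit result.

```c
#include <stdint.h>

/* Mask covering bits lo..hi inclusive (bit 0 = least significant bit).
   Hypothetical helper; not part of the Script Manager. */
uint16_t field_mask(unsigned lo, unsigned hi)
{
    uint32_t width = hi - lo + 1u;
    return (uint16_t)(((1u << width) - 1u) << lo);
}

/* Contents of the field in bits lo..hi, shifted down to bit 0,
   so a 4-bit field yields a value from 0 to 15. */
unsigned field_value(uint16_t result, unsigned lo, unsigned hi)
{
    return (unsigned)(result & field_mask(lo, hi)) >> lo;
}
```

Under these assumptions, field_mask(8, 11) covers the Class field and field_mask(15, 15) covers the Size bit. The Script Manager constants are defined in place (unshifted), so field_value is useful mainly for inspecting a result by hand.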
The Script Manager also defines a set of masks with which you can isolate each of the fields in the CharacterType return value. If you perform an AND operation with the CharacterType result and the mask for a particular field, you select only the bits in that field. Once you've done that, you can test the result, using the constants that represent the possible results.
The CharacterType field masks are the following:
The character type of the character in question is the result of performing an AND operation with smcTypeMask and the CharacterType result. Constants for the defined character types are listed on page 6-28.
The character class of the character in question is the result of performing an AND operation with smcClassMask and the CharacterType result. Character classes can be considered as subtypes of character types. Constants for the defined character classes are listed on page 6-29.
The orientation of the character in question is the result of performing an AND operation with smcOrientationMask and the CharacterType result. The orientation value can be either smCharHorizontal or smCharVertical.
The direction of the character in question is the result of performing an AND operation with smcRightMask and the CharacterType result. The direction value can be either smCharLeft (left-to-right) or smCharRight (right-to-left).
The case of the character in question is the result of performing an AND operation with smcUpperMask and the CharacterType result. The case value can be either smCharLower or smCharUpper.
The size of the character in question is the result of performing an AND operation with smcDoubleMask and the CharacterType result. The size value can be either smChar1byte or smChar2byte.
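The masking steps above can be sketched as follows. This is a standalone C illustration rather than Toolbox code: the mask values are derived from the bit ranges in the format table, the modifier values (0 for the first option, the mask bit for the second) are an assumption to be checked against the Script Manager interface files, and CharInfo and decode_char_type are hypothetical names.

```c
#include <stdint.h>

/* Field masks, derived from the bit ranges in the format table. */
enum {
    smcTypeMask        = 0x000F,  /* bits 0-3  */
    smcClassMask       = 0x0F00,  /* bits 8-11 */
    smcOrientationMask = 0x1000,  /* bit 12    */
    smcRightMask       = 0x2000,  /* bit 13    */
    smcUpperMask       = 0x4000,  /* bit 14    */
    smcDoubleMask      = 0x8000   /* bit 15    */
};

/* Modifier values: assumed here; verify against the interface files. */
enum {
    smCharHorizontal = 0x0000, smCharVertical = 0x1000,
    smCharLeft       = 0x0000, smCharRight    = 0x2000,
    smCharLower      = 0x0000, smCharUpper    = 0x4000,
    smChar1byte      = 0x0000, smChar2byte    = 0x8000
};

/* Hypothetical holder for the decoded fields. */
typedef struct {
    int type;          /* unshifted Type field; compare with type constants   */
    int charClass;     /* unshifted Class field; compare with class constants */
    int isVertical;
    int isRightToLeft; /* meaningful for 1-byte scripts; see footnote 6 */
    int isUpper;
    int isTwoByte;
} CharInfo;

/* Split a CharacterType result into its fields by ANDing with each mask. */
CharInfo decode_char_type(int16_t result)
{
    /* The result is a signed 16-bit Integer; work on the raw bit pattern. */
    uint16_t bits = (uint16_t)result;
    CharInfo info;

    info.type          = bits & smcTypeMask;
    info.charClass     = bits & smcClassMask;
    info.isVertical    = (bits & smcOrientationMask) == smCharVertical;
    info.isRightToLeft = (bits & smcRightMask) == smCharRight;
    info.isUpper       = (bits & smcUpperMask) == smCharUpper;
    info.isTwoByte     = (bits & smcDoubleMask) == smChar2byte;
    return info;
}
```

Because bit 15 is the sign bit of a 16-bit integer, the result for a 2-byte character is negative when read as a signed value; converting to an unsigned type before masking keeps the comparisons straightforward.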
- Note
CharacterType calls CharacterByteType to determine whether the byte at textOffset is a 1-byte character or the first byte or second byte of a 2-byte character. The larger the text buffer, the longer CharacterByteType takes to execute. To be most efficient, place the pointer textBuf at the beginning of the character of interest before calling CharacterType. (If you want to be compatible with older versions of CharacterType, also set textOffset to 1, rather than 0, for 2-byte characters.)
SPECIAL CONSIDERATIONS
CharacterType may move memory; your application should not call this function at interrupt time.
If you specify smCurrentScript for the script parameter, CharacterType always assumes that the text in the buffer belongs to the font script. It is unaffected by the state of the font force flag or the international resources selection flag.
For 1-byte script systems, the character-type tables are in the string-manipulation ('itl2') resource. For 2-byte script systems, they are in the encoding/rendering ('itl5') resource. If the appropriate resource does not include these tables, CharacterType exits without doing anything.
Some Roman fonts (for example, Symbol) substitute other characters for the standard characters in the Standard Roman character set. Since the Roman script system CharacterType function assumes the Standard Roman character set, it may return inappropriate results for nonstandard characters.
In versions of system software earlier than 7.0, the textOffset parameter to the CharacterType function must point to the second byte of a 2-byte character.
RESULT CODES
The complete set of CharacterType return values is found in the section "Getting Character-Type Information" beginning on page 6-28.
[6] In 2-byte script systems, bit 13 indicates whether or not the character is part of the main character set (not a user-defined character).