Important: The information in this document is obsolete and should not be used for new development.
Analyzing Characters
This section describes the functions CharacterByteType, CharacterType, and FillParseTable, which give you information about a character or group of characters, specified by character code:
- The CharacterByteType function identifies a byte in a text buffer as a 1-byte character or as the first or second byte of a 2-byte character.
- The CharacterType function returns specific information about the character at a particular byte offset.
- The FillParseTable function fills a 256-byte table that indicates, for each possible byte value, whether it is the first byte of a 2-byte character.
The script system associated with the character you wish to examine must be enabled in order for any of these three routines to provide useful information. For example, if only the Roman script system is available and you attempt to identify a byte in a run of 2-byte characters, the CharacterByteType function returns 0, indicating that the byte is a 1-byte character.
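The relationship between a parse table and byte classification can be illustrated with a small, self-contained C sketch. It does not call the Script Manager and is not its implementation. The names fill_parse_table, byte_type_at, and the k... constants are hypothetical, and the lead-byte ranges (0x81-0x9F and 0xE0-0xFC, in the style of Shift-JIS) are an assumption made for illustration. The two_byte_script_enabled flag models the requirement that the relevant script system be enabled: when it is 0, every byte is reported as a 1-byte character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classifications. Only the value 0 for a 1-byte
   character is taken from the text; the others are illustrative. */
enum { kSingleByte = 0, kFirstByte = -1, kLastByte = 1 };

/* Fill a 256-entry table: entry b is 1 if byte value b can be the
   first byte of a 2-byte character, otherwise 0.
   ASSUMPTION: Shift-JIS-style lead-byte ranges 0x81-0x9F, 0xE0-0xFC. */
void fill_parse_table(unsigned char table[256], int two_byte_script_enabled)
{
    for (int b = 0; b < 256; b++) {
        int lead = (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
        table[b] = (unsigned char)(two_byte_script_enabled && lead);
    }
}

/* Classify the byte at `offset` by walking the buffer from its start,
   because a byte's role depends on the bytes that precede it.
   A lead byte with no following byte is treated as a 1-byte character. */
int byte_type_at(const unsigned char *text, size_t len, size_t offset,
                 const unsigned char table[256])
{
    assert(offset < len);
    size_t i = 0;
    while (i < len) {
        if (table[text[i]] && i + 1 < len) {
            if (offset == i)     return kFirstByte;
            if (offset == i + 1) return kLastByte;
            i += 2;              /* skip both bytes of the character */
        } else {
            if (offset == i)     return kSingleByte;
            i += 1;
        }
    }
    return kSingleByte;          /* not reached when offset < len */
}
```

With the table filled for an enabled 2-byte script, the buffer { 'A', 0x82, 0xA0, 'B' } classifies as single, first, last, single. With the table filled for a disabled script, all four bytes classify as single (0), which mirrors the Roman-only case described above.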
- 1-byte script systems
- For 1-byte script systems, the character-type tables reside in the string-manipulation ('itl2') resource and reflect region-specific or language-specific differences in uppercase conventions. The CharacterType function gets the tables from the string-manipulation resource using the GetIntlResource function.
- 2-byte script systems
- For 2-byte script systems, the character-type tables reside in the encoding/rendering ('itl5') resource, not the string-manipulation resource. Whenever you call CharacterByteType, CharacterType, or FillParseTable, the necessary character-set encoding information is taken from the encoding/rendering resource. You cannot use the GetIntlResource function to access 2-byte character-type tables directly.
Subtopics
- CharacterByteType
- CharacterType
- FillParseTable