Apple Developer Connection

Unicode

Most character sets and character encoding schemes developed in the past were limited in that they supported just one language or a small set of languages. Multilingual software has traditionally had to implement methods for supporting and identifying multiple character encodings. To interpret a numerically encoded character, you needed to know which text encoding system was used to encode it. Because many text encoding systems exist and each assigns its numeric values independently, the same numeric value may represent different characters in different systems. The adoption of Unicode has changed this.
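To illustrate the ambiguity described above, the following minimal sketch decodes one and the same byte value under three legacy encodings. It uses Python's standard codecs rather than any ATSUI or Text Encoding Converter API; the helper name `decode_under` and the choice of encodings are assumptions made purely for illustration.

```python
# One raw byte value, interpreted under three different legacy encodings.
raw = bytes([0xE9])

def decode_under(encodings, data):
    """Return {encoding name: decoded text} for the same raw bytes.

    Hypothetical helper for this illustration only.
    """
    return {name: data.decode(name) for name in encodings}

# ISO Latin-1, Mac OS Roman, and Windows Cyrillic (code page 1251).
results = decode_under(["latin-1", "mac_roman", "cp1251"], raw)

for name, text in results.items():
    # Show which Unicode code point each encoding maps the byte to.
    print(f"{name:10} 0x{raw[0]:02X} -> {text!r} (U+{ord(text):04X})")
```

Each encoding maps the byte to a different character, so the byte alone does not identify the character; once the text is converted to Unicode, each of those characters has its own distinct code point.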

Unicode is a character encoding system designed to support the interchange, processing, and display of the written texts of the diverse languages of the modern world. Unicode supplies enough numeric values to encode the characters used in those written texts. It provides a single model for text display and editing. Unicode also simplifies the handling of bidirectional text and of characters that change shape according to their position in a word.

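Unicode has three primary formats available for encoding: UTF-8, UTF-16, and UTF-32 (UTF stands for Unicode Transformation Format). UTF-8 uses 8-bit code units and encodes each character in one to four bytes; UTF-16 uses 16-bit code units and encodes each character in one or two units (two or four bytes); and UTF-32 uses a single 32-bit unit (four bytes) for every character. ATSUI uses UTF-16, in which most commonly used characters occupy two bytes and characters outside the Basic Multilingual Plane are represented by a surrogate pair of two 16-bit units. Text that uses an encoding other than Unicode can be converted to Unicode using the Text Encoding Converter. See Programming With the Text Encoding Conversion Manager for more information.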
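The storage cost of a character in each of the three formats can be checked directly. The sketch below uses Python's built-in codecs, not ATSUI; the function name `encoded_sizes` and the sample characters are assumptions chosen for illustration. The big-endian codec variants are used so that no byte order mark is added to the output.

```python
def encoded_sizes(ch):
    """Return the number of bytes needed to encode ch in each format.

    Hypothetical helper for this illustration only.
    """
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }

# Latin capital A, e with acute, euro sign, and musical symbol G clef
# (U+1D11E, which lies outside the Basic Multilingual Plane).
samples = ["A", "\u00e9", "\u20ac", "\U0001d11e"]

for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))

# In UTF-16 the G clef is stored as two 16-bit units (a surrogate pair).
clef_utf16 = "\U0001d11e".encode("utf-16-be").hex()
print(clef_utf16)  # d834dd1e
```

Code that indexes UTF-16 text by 16-bit unit should therefore treat a surrogate pair as one character rather than two.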

Because Unicode includes the character repertoires of most common character encodings, it facilitates data interchange with other platforms. Using Unicode, text manipulated by your application and shared across applications and platforms can be encoded in a single coded character set.

Unicode provides some special features, such as combining or nonspacing marks and conjoining Jamo. These features are a function of the variety of languages that Unicode handles. If you have coded applications that handle text for the languages these features support, they should be familiar to you. If you have used a single coded character set such as ASCII almost exclusively, these features will be new to you. ATSUI lets you control how the special features available through Unicode are rendered.
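For readers new to these features, the following minimal sketch shows a combining mark and a sequence of conjoining Jamo at the code point level. It relies on Python's standard `unicodedata` module and says nothing about how ATSUI renders such sequences; the variable names are assumptions made for illustration.

```python
import unicodedata

# A base letter followed by a combining (nonspacing) acute accent.
combining_seq = "e\u0301"
# The precomposed form of the same text: e with acute.
precomposed = "\u00e9"

# A nonzero combining class marks U+0301 as a combining mark.
accent_class = unicodedata.combining("\u0301")

# Normalization Form C composes the two-character sequence.
composed = unicodedata.normalize("NFC", combining_seq)

# Three conjoining Jamo: leading consonant, vowel, trailing consonant.
jamo_seq = "\u1112\u1161\u11ab"
# Under NFC they compose into one precomposed Hangul syllable.
syllable = unicodedata.normalize("NFC", jamo_seq)

print(len(combining_seq), len(composed), accent_class)
print(len(jamo_seq), len(syllable), f"U+{ord(syllable):04X}")
```

In both cases several code points represent what a reader perceives as a single character, which is why the number of code points in a string need not match the number of characters displayed.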

For more information on Unicode, see http://www.unicode.org. For additional details on how ATSUI implements the Unicode specification, see “ATSUI Implementation of the Unicode Specification.”





Last updated: 2007-07-10





Copyright © 2007 Apple Inc.
All rights reserved.