Important: The information in this document is obsolete and should not be used for new development.

Programming With the Text Encoding Conversion Manager



Character Encodings and Their Internet Names

Table C-1 lists character encodings for various languages, gives some of their common Internet names, and identifies the version of the Text Encoding Conversion Manager in which each encoding was first supported by the Text Encoding Converter and by the Unicode Converter. In the last two columns of the table, "N/A" means that the encoding is not supported.

Table C-1  Character encoding Internet names and availability in Mac OS

Columns: Character encoding; Common Internet names; Related information; version of the Text Encoding Conversion Manager that first offered support in the Text Encoding Converter; version that first offered support in the Unicode Converter.
Universal

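| Character encoding | Common Internet names | Related information | Text Encoding Converter (first version) | Unicode Converter (first version) |
| --- | --- | --- | --- | --- |
| Unicode 2.0 (16-bit) | UTF-16 | | 1.2 | 1.2 |
| Unicode 2.0 UTF-8 | UTF-8 | | 1.2 | 1.2.1 |
| Unicode 2.0 UTF-7 | UTF-7 | | 1.2 | N/A |
| Unicode 1.1 (16-bit) | UNICODE-1-1 | | 1.2 | 1.2 |
| Unicode 1.1 UTF-8 | UNICODE-1-1-UTF-8 | | 1.2 | 1.2.1 |
| Unicode 1.1 UTF-7 | UNICODE-1-1-UTF-7 | | 1.2 | N/A |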

Western European languages

| Character encoding | Common Internet names | Related information | Text Encoding Converter (first version) | Unicode Converter (first version) |
| --- | --- | --- | --- | --- |
| ASCII | US-ASCII | | 1.2.1 | 1.2.1 |
| ISO 8859-1 (Latin-1) | ISO-8859-1, latin1 | | 1.2.1 | 1.2.1 |
| ISO 8859-3 (Latin-3) | ISO-8859-3, latin3 | | 1.5 | 1.5 |
| ISO 8859-15 (Latin-9) | ISO-8859-15, latin9 | Latin-1 with EURO SIGN and CP 1252 letters | 1.5 | 1.5 |
| CP 1252 (Windows Latin-1) | windows-1252, cp1252 | ISO 8859-1, plus additions in C1 area | 1.2 | 1.2 |
| CP 437 (DOS Latin-US) | cp437 | | 1.2 | 1.2 |
| CP 850 (DOS Latin-1) | cp850 | | 1.4 | 1.4 |
| Mac OS Roman | mac, macintosh, x-mac-roman | | 1.2 | 1.2 |
| Mac OS Icelandic | x-mac-icelandic | Based on Mac OS Roman | 1.2 | 1.2 |
| Mac OS Latin-1, Mac OS Mail | x-mac-latin1 (commonly sent as ISO-8859-1) | Mac OS Roman permuted to align with 8859-1 | 1.2 | 1.2 |
| NextStep Latin | | | 1.2 | 1.2 |
| CP 037 (EBCDIC-US) | cp037 | ISO 8859-1 repertoire, different layout | 1.2.1 | 1.2.1 |

Arabic

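| Character encoding | Common Internet names | Related information | Text Encoding Converter (first version) | Unicode Converter (first version) |
| --- | --- | --- | --- | --- |
| ISO 8859-6 (Latin/Arabic) | ISO-8859-6, arabic | | 1.2 | 1.2 |
| CP 1256 (Windows Arabic) | windows-1256, cp1256 | Partly 8859-6, plus C1 additions | 1.2 | 1.2 |
| CP 864 (DOS Arabic) | cp864 | Encodes Arabic presentation forms | 1.2 | 1.2 |
| Mac OS Arabic | x-mac-arabic | | 1.2 | 1.2 |
| Mac OS Farsi | x-mac-farsi | | 1.2 | 1.2 |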

Central European languages

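| Character encoding | Common Internet names | Related information | Text Encoding Converter (first version) | Unicode Converter (first version) |
| --- | --- | --- | --- | --- |
| ISO 8859-2 (Latin-2) | ISO-8859-2, latin2 | | 1.2 | 1.2 |
| ISO 8859-4 (Latin-4) | ISO-8859-4, latin4 | | 1.5 | 1.5 |
| CP 1250 (Windows Latin-2) | windows-1250, cp1250 | Partly 8859-2, plus C1 additions | 1.2 | 1.2 |
| CP 1257 (Windows BalticRim) | windows-1257, cp1257 | | 1.5 | 1.5 |
| Mac OS Central European Roman | x-mac-centraleurroman | | 1.2 | 1.2 |
| Mac OS Croatian | x-mac-croatian | Based on Mac OS Roman | 1.2 | 1.2 |
| Mac OS Romanian | x-mac-romanian | Based on Mac OS Roman | 1.2 | 1.2 |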

Chinese

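| Character encoding | Common Internet names | Related information | Text Encoding Converter (first version) | Unicode Converter (first version) |
| --- | --- | --- | --- | --- |
| GB 2312-80 | | | 1.2 | N/A |
| EUC-CN | GB2312, X-EUC-CN | ASCII + GB 2312-80 (8-bit) | 1.2 | 1.2 |
| CP 936 (DOS and Windows Simplified) | | Similar to GBK | 1.4 | 1.4 |
| Mac OS Chinese Simplified | | Based on EUC-CN | 1.2 | 1.2 |
| ISO 2022-CN ("GB") | ISO-2022-CN | ASCII + GB 2312-80 (7-bit) (see RFC 1922) | 1.2 | N/A |
| HZ | HZ-GB-2312 | ASCII + GB 2312-80 (7-bit) (see RFC 1842) | 1.2 | N/A |
| GBK (extended GB) | | EUC-CN + Unihan repertoire (8-bit) | 1.2 | 1.2 |
| CNS 11643 plane 1 | x-cns11643-1 | | N/A | N/A |
| CNS 11643 plane 2 | x-cns11643-2 | | N/A | N/A |
| EUC-TW | X-EUC-TW | ASCII + CNS 11643-1992 (8-bit) | 1.2 | 1.2 |
| Big-5 | Big5 | (8-bit) | 1.2 | 1.2 |
| CP 950 (DOS and Windows Traditional) | | Based on Big-5 | 1.4 | 1.4 |
| Mac OS Chinese Traditional | | Based on Big-5 | 1.2 | 1.2 |
| CCCII | | | N/A | N/A |
| EACC | | | N/A | N/A |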

Cyrillic

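| Character encoding | Common Internet names | Related information | Text Encoding Converter (first version) | Unicode Converter (first version) |
| --- | --- | --- | --- | --- |
| ISO 8859-5 (Latin/Cyrillic) | ISO-8859-5, cyrillic | | 1.2 | 1.2 |
| KOI8-R | KOI8-R | See RFC 1489 | 1.2 | 1.2 |
| CP 1251 (Windows Cyrillic) | windows-1251, cp1251 | Not based on ISO 8859-5 | 1.2 | 1.2 |
| CP 866 (DOS Russian) | cp866 | | N/A | N/A |
| Mac OS Cyrillic | x-mac-cyrillic | | 1.2 | 1.2 |
| Mac OS Ukrainian | x-mac-ukrainian | Mac OS Cyrillic with two replacements | 1.2 | 1.2 |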

Greek

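| Character encoding | Common Internet names | Related information | Text Encoding Converter (first version) | Unicode Converter (first version) |
| --- | --- | --- | --- | --- |
| ISO 8859-7 | ISO-8859-7, greek | | 1.2 | 1.2 |
| ISO 5428 | ISO_5428:1980 | | N/A | N/A |
| CP 1253 (Windows Greek) | windows-1253, cp1253 | Nearly 8859-7, plus C1 additions | 1.2 | 1.2 |
| Mac OS Greek | x-mac-greek | | 1.2 | 1.2 |
| Greek CCITT | greek-ccitt | | N/A | N/A |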

Hebrew

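| Character encoding | Common Internet names | Related information | Text Encoding Converter (first version) | Unicode Converter (first version) |
| --- | --- | --- | --- | --- |
| ISO 8859-8 (Latin/Hebrew) | ISO-8859-8, hebrew | | 1.2 | 1.2 |
| CP 1255 (Windows Hebrew) | windows-1255, cp1255 | Mostly 8859-8, plus C1 additions | 1.2 | 1.2 |
| Mac OS Hebrew (2 variants) | x-mac-hebrew | | 1.2 | 1.2 |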

Indic

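| Character encoding | Common Internet names | Related information | Text Encoding Converter (first version) | Unicode Converter (first version) |
| --- | --- | --- | --- | --- |
| ISCII-91 | | Parallel encodings for all Indic scripts | N/A | N/A |
| Mac OS Gujarati | | | 1.2 | 1.2 |
| Mac OS Devanagari | | | 1.2 | 1.2 |
| Mac OS Gurmukhi | | | 1.2 | 1.2 |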

Japanese

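| Character encoding | Common Internet names | Related information | Text Encoding Converter (first version) | Unicode Converter (first version) |
| --- | --- | --- | --- | --- |
| JIS X0208 | | | 1.2 | N/A |
| JIS X0212 | | | N/A | N/A |
| EUC-JP | EUC-JP, X-EUC-JP | JIS 201 + JIS 208 + JIS 212 (8-bit) | 1.2 | 1.4 |
| ISO 2022-JP ("JIS") | ISO-2022-JP | JIS 201 + JIS 208 + JIS 212 (7-bit); RFC 1468 | 1.2 | N/A |
| Shift-JIS | Shift_JIS, x-sjis, x-shift-jis | JIS 201 + JIS 208 (8-bit) | 1.2 | 1.2 |
| CP 932 (DOS + Windows) | | Based on Shift-JIS | 1.4 | 1.4 |
| Mac OS Japanese | | Based on Shift-JIS | 1.2 | 1.2 |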

Korean

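| Character encoding | Common Internet names | Related information | Text Encoding Converter (first version) | Unicode Converter (first version) |
| --- | --- | --- | --- | --- |
| KSC 5601-1987 | | | 1.2 | N/A |
| EUC-KR | EUC-KR | ASCII + KSC 5601-87 (8-bit); RFC 1557 | 1.2 | 1.2 |
| CP 949 (DOS + Windows) | | Unified Hangul Code: EUC-KR + Johab | N/A | N/A |
| Mac OS Korean | | Based on EUC-KR | 1.2 | 1.2 |
| ISO 2022-KR ("KSC") | ISO-2022-KR | ASCII + KSC 5601-87 (7-bit); RFC 1557 | 1.2 | N/A |
| KSC 5700 | | | N/A | N/A |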

Symbols encoding

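| Character encoding | Common Internet names | Related information | Text Encoding Converter (first version) | Unicode Converter (first version) |
| --- | --- | --- | --- | --- |
| Adobe Symbol | Adobe-Symbol-Encoding | | N/A | N/A |
| Mac OS Symbol | x-mac-symbol | Based on Adobe Symbol | 1.2 | 1.2 |
| Mac OS Dingbats | x-mac-dingbats | Based on Adobe Zapf Dingbats | 1.2 | 1.2 |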

Thai

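| Character encoding | Common Internet names | Related information | Text Encoding Converter (first version) | Unicode Converter (first version) |
| --- | --- | --- | --- | --- |
| TIS 620-2533 | | | N/A | N/A |
| CP 874 (DOS + Windows) | cp874 | Based on TIS 620-2533 | 1.4 | 1.4 |
| Mac OS Thai | x-mac-thai | Based on TIS 620-2533 | 1.2 | 1.2 |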

Turkish

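| Character encoding | Common Internet names | Related information | Text Encoding Converter (first version) | Unicode Converter (first version) |
| --- | --- | --- | --- | --- |
| ISO 8859-9 (Latin-5) | ISO-8859-9, latin5 | | 1.2 | 1.2 |
| ISO 8859-3 (Latin-3) | ISO-8859-3 | | N/A | N/A |
| CP 1254 (Windows Latin-5) | windows-1254, cp1254 | | 1.2 | 1.2 |
| Mac OS Turkish | x-mac-turkish | Based on Mac OS Roman | 1.2 | 1.2 |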

Vietnamese

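| Character encoding | Common Internet names | Related information | Text Encoding Converter (first version) | Unicode Converter (first version) |
| --- | --- | --- | --- | --- |
| VISCII | VISCII | RFC 1456 | N/A | N/A |
| TCVN-n | | | N/A | N/A |
| CP 1258 (Windows Vietnamese) | windows-1258, cp1258 | | 1.5 | 1.5 |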
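
As an illustration of how Table C-1 might be consulted programmatically, the following Python sketch stores a handful of its rows and looks them up by Internet name. It is not part of the Text Encoding Conversion Manager API: the names `Support`, `ROWS`, `BY_NAME`, and `lookup` are hypothetical, and matching names case-insensitively is an assumption rather than documented behavior. An "N/A" entry in the table is represented as `None`.

```python
from typing import NamedTuple, Optional


class Support(NamedTuple):
    """One row of Table C-1 (hypothetical representation)."""
    encoding: str
    tec_version: Optional[str]      # Text Encoding Converter; None means "N/A"
    unicode_version: Optional[str]  # Unicode Converter; None means "N/A"


# A few rows copied from Table C-1, each with its common Internet names.
ROWS = [
    (("utf-8",), Support("Unicode 2.0 UTF-8", "1.2", "1.2.1")),
    (("utf-7",), Support("Unicode 2.0 UTF-7", "1.2", None)),
    (("iso-8859-1", "latin1"), Support("ISO 8859-1 (Latin-1)", "1.2.1", "1.2.1")),
    (("windows-1252", "cp1252"), Support("CP 1252 (Windows Latin-1)", "1.2", "1.2")),
    (("cp866",), Support("CP 866 (DOS Russian)", None, None)),
]

# Index every Internet name (stored in lowercase) to its row.
BY_NAME = {name: support for names, support in ROWS for name in names}


def lookup(internet_name: str) -> Optional[Support]:
    """Return the row for an Internet name, or None if it is not in this excerpt.

    Assumption: names are compared case-insensitively.
    """
    return BY_NAME.get(internet_name.strip().lower())
```

For example, `lookup("UTF-7").unicode_version` is `None` because the table lists the Unicode Converter column for UTF-7 as "N/A", while `lookup("Latin1")` and `lookup("ISO-8859-1")` return the same row.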


© 1999 Apple Computer, Inc. – (Last Updated 13 Dec 99)
