
Important: The information in this document is obsolete and should not be used for new development.

Inside Macintosh: Text /
Appendix A - Built-in Script Support / The Roman Script System


The Standard Roman Character Set

The Standard Roman character set is an extended version of the Macintosh character set, documented in Volume I of the original Inside Macintosh. The Macintosh character set is itself an extended version of the ASCII character set. The conventional ASCII character set, also called low ASCII, defines control codes, symbols, numbers, and letters, assigning them character codes from $00 through $7F. The Macintosh character set adds codes from $80 through $D8, representing accented characters and additional symbols. Current Macintosh file-system sorting, as well as the sorting order used by several Text Utilities routines such as RelString, is based on the Macintosh character set.

The Standard Roman character set adds more accented forms, symbols, and diacritical marks, assigning them character codes from $D9 through $FF. It thus covers every character code from $00 through $FF, and it includes uppercase versions of all of the lowercase accented forms, a number of symbols, and other forms. See Figure A-1.
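
The three code ranges described above can be sketched as a small lookup. This is only an illustration in Python; the function name `classify_code` and the labels it returns are assumptions made for this example and are not part of any Macintosh system software interface.

```python
def classify_code(code):
    """Name the layer of the Standard Roman character set that defines `code`."""
    if not 0x00 <= code <= 0xFF:
        raise ValueError("character codes in this encoding are one byte: $00-$FF")
    if code <= 0x7F:
        return "ASCII"           # low ASCII: $00 through $7F
    if code <= 0xD8:
        return "Macintosh"       # Macintosh character set additions: $80 through $D8
    return "Standard Roman"      # Standard Roman additions: $D9 through $FF


for code in (0x41, 0x7F, 0x80, 0xD8, 0xD9, 0xFF):
    print(f"${code:02X} -> {classify_code(code)}")
```

A check like this only reports which layer introduced a code. It says nothing about whether a given font supplies a glyph for it.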

The Standard Roman character set is the closest to a universal character encoding that exists in the Roman script system. The Standard Roman characters are available in most Roman outline fonts, but not all are available in the Apple bitmapped versions of Chicago, Geneva, New York, and Monaco.

Figure A-1 The Standard Roman character set

Nonprinting Characters

Table A-1 lists the nonprinting characters in the Standard Roman character set, along with the Unicode 1.0 name and the Macintosh character code (in hexadecimal and decimal) of each. (Unicode is an ISO standard for 16-bit universal worldwide character encoding.)
Table A-1 Nonprinting characters in the Standard Roman character set

Unicode name                Hexadecimal  Decimal
NULL                        $00          0
START OF HEADING            $01          1
START OF TEXT               $02          2
END OF TEXT                 $03          3
END OF TRANSMISSION         $04          4
ENQUIRY                     $05          5
ACKNOWLEDGE                 $06          6
BELL                        $07          7
BACKSPACE                   $08          8
HORIZONTAL TABULATION       $09          9
LINE FEED                   $0A          10
VERTICAL TABULATION         $0B          11
FORM FEED                   $0C          12
CARRIAGE RETURN             $0D          13
SHIFT OUT                   $0E          14
SHIFT IN                    $0F          15
DATA LINK ESCAPE            $10          16
DEVICE CONTROL ONE          $11          17
DEVICE CONTROL TWO          $12          18
DEVICE CONTROL THREE        $13          19
DEVICE CONTROL FOUR         $14          20
NEGATIVE ACKNOWLEDGE        $15          21
SYNCHRONOUS IDLE            $16          22
END OF TRANSMISSION BLOCK   $17          23
CANCEL                      $18          24
END OF MEDIUM               $19          25
SUBSTITUTE                  $1A          26
ESCAPE                      $1B          27
FILE SEPARATOR              $1C          28
GROUP SEPARATOR             $1D          29
RECORD SEPARATOR            $1E          30
UNIT SEPARATOR              $1F          31
DELETE                      $7F          127

Using Roman Character Codes as Delimiters

Your application may need to use a character code or range of codes to represent noncharacter data (such as field delimiters). Character codes below $20 are never affected by the script system. Some of these character codes can be used safely for special purposes. Note, however, that most characters in this range are already assigned special meanings by parts of Macintosh system software, such as TextEdit, or by programming languages like C. Table A-2 lists the low-ASCII characters to avoid in your application.
Table A-2 Low-ASCII characters to avoid as delimiters

Character           Hexadecimal representation
Null                $00
Home                $01
Enter               $03
End                 $04
Help                $05
Backspace           $08
Tab                 $09
Page up             $0B
Page down           $0C
Carriage return     $0D
F1 through F15      $10
System characters   $11, $12, $13, $14 [7]
Clear               $1B
Arrow keys          $1C, $1D, $1E, $1F

For certain writing systems, font layouts (tables that map glyph codes to glyphs) may use some of these character codes internally, for ligatures or other contextual forms. Also, as noted in Table A-2, system fonts use codes $11 through $14 for printing special symbols such as the Apple logo. Thus in unusual situations font changes may have an impact on the glyph representation of stored character codes with values less than $20, even though a user cannot generate those codes directly.
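
As a rough sketch of how an application might act on Table A-2, the following Python fragment lists the codes below $20 that the table does not mention and uses one of them to separate fields. The names `CODES_TO_AVOID`, `candidate_delimiters`, and `split_fields` are hypothetical, and the result is only as good as the table: a code that Table A-2 omits (for example $0A, line feed) may still have a meaning in a particular language or file format, and the caveat above about font layouts still applies.

```python
# Codes listed in Table A-2; each already has a meaning to system software or C.
CODES_TO_AVOID = frozenset([
    0x00, 0x01, 0x03, 0x04, 0x05, 0x08, 0x09, 0x0B, 0x0C, 0x0D,
    0x10, 0x11, 0x12, 0x13, 0x14, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F,
])


def candidate_delimiters():
    """Return the codes below $20 that Table A-2 does not list."""
    return [code for code in range(0x20) if code not in CODES_TO_AVOID]


def split_fields(record, delimiter):
    """Split a byte string on a one-byte delimiter, rejecting codes from Table A-2."""
    if delimiter not in candidate_delimiters():
        raise ValueError(f"${delimiter:02X} is not a candidate delimiter")
    return record.split(bytes([delimiter]))


print([f"${code:02X}" for code in candidate_delimiters()])
print(split_fields(b"Chicago\x02Geneva\x02Monaco", 0x02))
```

Rejecting a listed code at the point of use, rather than trusting the caller, keeps a later change of delimiter from silently colliding with a key code such as Tab ($09).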

Printing Characters

Table A-3 shows the printing characters that exist in the Standard Roman character set. Macintosh applications can assume that glyphs for these characters exist in every Roman font. (However, see also the discussion of Roman fonts on page A-18.) The Unicode 1.0 and PostScript names and Macintosh character code (in hexadecimal and decimal) are provided along with a glyph example for printable characters. Modified versions of the Standard Roman character set exist for Croatian, Romanian, Turkish, and Icelandic/Faroese, with different character assignments for the same codes. See Table A-4 through Table A-7.
Table A-3 Printing characters in the Standard Roman character set

Glyph  Unicode name                        PostScript name  Hexadecimal  Decimal
       SPACE                               space            $20          32
!      EXCLAMATION MARK                    exclam           $21          33
"      QUOTATION MARK                      quotedbl         $22          34
#      NUMBER SIGN                         numbersign       $23          35
$      DOLLAR SIGN                         dollar           $24          36
%      PERCENT SIGN                        percent          $25          37
&      AMPERSAND                           ampersand        $26          38
'      APOSTROPHE-QUOTE                    quotesingle      $27          39
(      OPENING PARENTHESIS                 parenleft        $28          40
)      CLOSING PARENTHESIS                 parenright       $29          41
*      ASTERISK                            asterisk         $2A          42
+      PLUS SIGN                           plus             $2B          43
,      COMMA                               comma            $2C          44
-      HYPHEN-MINUS                        hyphen           $2D          45
.      PERIOD                              period           $2E          46
/      SLASH                               slash            $2F          47
0      DIGIT ZERO                          zero             $30          48
1      DIGIT ONE                           one              $31          49
2      DIGIT TWO                           two              $32          50
3      DIGIT THREE                         three            $33          51
4      DIGIT FOUR                          four             $34          52
5      DIGIT FIVE                          five             $35          53
6      DIGIT SIX                           six              $36          54
7      DIGIT SEVEN                         seven            $37          55
8      DIGIT EIGHT                         eight            $38          56
9      DIGIT NINE                          nine             $39          57
:      COLON                               colon            $3A          58
;      SEMICOLON                           semicolon        $3B          59
<      LESS-THAN SIGN                      less             $3C          60
=      EQUALS SIGN                         equal            $3D          61
>      GREATER-THAN SIGN                   greater          $3E          62
?      QUESTION MARK                       question         $3F          63
@      COMMERCIAL AT                       at               $40          64
A      LATIN CAPITAL LETTER A              A                $41          65
B      LATIN CAPITAL LETTER B              B                $42          66
C      LATIN CAPITAL LETTER C              C                $43          67
D      LATIN CAPITAL LETTER D              D                $44          68
E      LATIN CAPITAL LETTER E              E                $45          69
F      LATIN CAPITAL LETTER F              F                $46          70
G      LATIN CAPITAL LETTER G              G                $47          71
H      LATIN CAPITAL LETTER H              H                $48          72
I      LATIN CAPITAL LETTER I              I                $49          73
J      LATIN CAPITAL LETTER J              J                $4A          74
K      LATIN CAPITAL LETTER K              K                $4B          75
L      LATIN CAPITAL LETTER L              L                $4C          76
M      LATIN CAPITAL LETTER M              M                $4D          77
N      LATIN CAPITAL LETTER N              N                $4E          78
O      LATIN CAPITAL LETTER O              O                $4F          79
P      LATIN CAPITAL LETTER P              P                $50          80
Q      LATIN CAPITAL LETTER Q              Q                $51          81
R      LATIN CAPITAL LETTER R              R                $52          82
S      LATIN CAPITAL LETTER S              S                $53          83
T      LATIN CAPITAL LETTER T              T                $54          84
U      LATIN CAPITAL LETTER U              U                $55          85
V      LATIN CAPITAL LETTER V              V                $56          86
W      LATIN CAPITAL LETTER W              W                $57          87
X      LATIN CAPITAL LETTER X              X                $58          88
Y      LATIN CAPITAL LETTER Y              Y                $59          89
Z      LATIN CAPITAL LETTER Z              Z                $5A          90
[      OPENING SQUARE BRACKET              bracketleft      $5B          91
\      BACK SLASH                          backslash        $5C          92
]      CLOSING SQUARE BRACKET              bracketright     $5D          93
^      SPACING CIRCUMFLEX                  asciicircum      $5E          94
_      SPACING UNDERSCORE                  underscore       $5F          95
`      SPACING GRAVE                       grave            $60          96
a      LATIN SMALL LETTER A                a                $61          97
b      LATIN SMALL LETTER B                b                $62          98
c      LATIN SMALL LETTER C                c                $63          99
d      LATIN SMALL LETTER D                d                $64          100
e      LATIN SMALL LETTER E                e                $65          101
f      LATIN SMALL LETTER F                f                $66          102
g      LATIN SMALL LETTER G                g                $67          103
h      LATIN SMALL LETTER H                h                $68          104
i      LATIN SMALL LETTER I                i                $69          105
j      LATIN SMALL LETTER J                j                $6A          106
k      LATIN SMALL LETTER K                k                $6B          107
l      LATIN SMALL LETTER L                l                $6C          108
m      LATIN SMALL LETTER M                m                $6D          109
n      LATIN SMALL LETTER N                n                $6E          110
o      LATIN SMALL LETTER O                o                $6F          111
p      LATIN SMALL LETTER P                p                $70          112
q      LATIN SMALL LETTER Q                q                $71          113
r      LATIN SMALL LETTER R                r                $72          114
s      LATIN SMALL LETTER S                s                $73          115
t      LATIN SMALL LETTER T                t                $74          116
u      LATIN SMALL LETTER U                u                $75          117
v      LATIN SMALL LETTER V                v                $76          118
w      LATIN SMALL LETTER W                w                $77          119
x      LATIN SMALL LETTER X                x                $78          120
y      LATIN SMALL LETTER Y                y                $79          121
z      LATIN SMALL LETTER Z                z                $7A          122
{      OPENING CURLY BRACKET               braceleft        $7B          123
|      VERTICAL BAR                        bar              $7C          124
}      CLOSING CURLY BRACKET               braceright       $7D          125
~      TILDE                               asciitilde       $7E          126
       DELETE (nonprinting)                                 $7F          127
Ä      LATIN CAPITAL LETTER A DIAERESIS    Adieresis        $80          128
Å      LATIN CAPITAL LETTER A RING         Aring            $81          129
Ç      LATIN CAPITAL LETTER C CEDILLA      Ccedilla         $82          130
É      LATIN CAPITAL LETTER E ACUTE        Eacute           $83          131
Ñ      LATIN CAPITAL LETTER N TILDE        Ntilde           $84          132
Ö      LATIN CAPITAL LETTER O DIAERESIS    Odieresis        $85          133
Ü      LATIN CAPITAL LETTER U DIAERESIS    Udieresis        $86          134
á      LATIN SMALL LETTER A ACUTE          aacute           $87          135
à      LATIN SMALL LETTER A GRAVE          agrave           $88          136
â      LATIN SMALL LETTER A CIRCUMFLEX     acircumflex      $89          137
ä      LATIN SMALL LETTER A DIAERESIS      adieresis        $8A          138
ã      LATIN SMALL LETTER A TILDE          atilde           $8B          139
å      LATIN SMALL LETTER A RING           aring            $8C          140
ç      LATIN SMALL LETTER C CEDILLA        ccedilla         $8D          141
é      LATIN SMALL LETTER E ACUTE          eacute           $8E          142
è      LATIN SMALL LETTER E GRAVE          egrave           $8F          143
ê      LATIN SMALL LETTER E CIRCUMFLEX     ecircumflex      $90          144
ë      LATIN SMALL LETTER E DIAERESIS      edieresis        $91          145
í      LATIN SMALL LETTER I ACUTE          iacute           $92          146
ì      LATIN SMALL LETTER I GRAVE          igrave           $93          147
î      LATIN SMALL LETTER I CIRCUMFLEX     icircumflex      $94          148
ï      LATIN SMALL LETTER I DIAERESIS      idieresis        $95          149
ñ      LATIN SMALL LETTER N TILDE          ntilde           $96          150
ó      LATIN SMALL LETTER O ACUTE          oacute           $97          151
ò      LATIN SMALL LETTER O GRAVE          ograve           $98          152
ô      LATIN SMALL LETTER O CIRCUMFLEX     ocircumflex      $99          153
ö      LATIN SMALL LETTER O DIAERESIS      odieresis        $9A          154
õ      LATIN SMALL LETTER O TILDE          otilde           $9B          155
ú      LATIN SMALL LETTER U ACUTE          uacute           $9C          156
ù      LATIN SMALL LETTER U GRAVE          ugrave           $9D          157
û      LATIN SMALL LETTER U CIRCUMFLEX     ucircumflex      $9E          158
ü      LATIN SMALL LETTER U DIAERESIS      udieresis        $9F          159
†      DAGGER                              dagger           $A0          160
°      DEGREE SIGN                         degree           $A1          161
¢      CENT SIGN                           cent             $A2          162
£      POUND SIGN                          sterling         $A3          163
§      SECTION SIGN                        section          $A4          164
•      BULLET                              bullet           $A5          165
¶      PARAGRAPH SIGN                      paragraph        $A6          166
ß      LATIN SMALL LETTER SHARP S          germandbls       $A7          167
®      REGISTERED TRADEMARK SIGN           registered       $A8          168
©      COPYRIGHT SIGN                      copyright        $A9          169
™      TRADEMARK                           trademark        $AA          170
´      SPACING ACUTE                       acute            $AB          171
¨      SPACING DIAERESIS                   dieresis         $AC          172
≠      NOT EQUAL TO                        notequal         $AD          173
Æ      LATIN CAPITAL LETTER AE             AE               $AE          174
Ø      LATIN CAPITAL LETTER O SLASH        Oslash           $AF          175
∞      INFINITY                            infinity         $B0          176
±      PLUS-OR-MINUS SIGN                  plusminus        $B1          177
≤      LESS THAN OR EQUAL TO               lessequal        $B2          178
≥      GREATER THAN OR EQUAL TO            greaterequal     $B3          179
¥      YEN SIGN                            yen              $B4          180
µ      MICRO SIGN                          mu               $B5          181
∂      PARTIAL DIFFERENTIAL                partialdiff      $B6          182
∑      N-ARY SUMMATION                     summation        $B7          183
∏      N-ARY PRODUCT                       product          $B8          184
π      GREEK SMALL LETTER PI               pi               $B9          185
∫      INTEGRAL                            integral         $BA          186
ª      FEMININE ORDINAL INDICATOR          ordfeminine      $BB          187
º      MASCULINE ORDINAL INDICATOR         ordmasculine     $BC          188
Ω      OHM                                 Omega            $BD          189
æ      LATIN SMALL LETTER AE               ae               $BE          190
ø      LATIN SMALL LETTER O SLASH          oslash           $BF          191
¿      INVERTED QUESTION MARK              questiondown     $C0          192
¡      INVERTED EXCLAMATION MARK           exclamdown       $C1          193
¬      NOT SIGN                            logicalnot       $C2          194
√      SQUARE ROOT                         radical          $C3          195
ƒ      LATIN SMALL LETTER SCRIPT F         florin           $C4          196
≈      ALMOST EQUAL TO                     approxequal      $C5          197
∆      INCREMENT                           Delta            $C6          198
«      LEFT POINTING GUILLEMET             guillemotleft    $C7          199
»      RIGHT POINTING GUILLEMET            guillemotright   $C8          200
…      HORIZONTAL ELLIPSIS                 ellipsis         $C9          201
       NON-BREAKING SPACE                                   $CA          202
À      LATIN CAPITAL LETTER A GRAVE        Agrave           $CB          203
Ã      LATIN CAPITAL LETTER A TILDE        Atilde           $CC          204
Õ      LATIN CAPITAL LETTER O TILDE        Otilde           $CD          205
Œ      LATIN CAPITAL LETTER O E            OE               $CE          206
œ      LATIN SMALL LETTER O E              oe               $CF          207
–      EN DASH                             endash           $D0          208
—      EM DASH                             emdash           $D1          209
“      DOUBLE TURNED COMMA QUOTATION MARK  quotedblleft     $D2          210
”      DOUBLE COMMA QUOTATION MARK         quotedblright    $D3          211
‘      SINGLE TURNED COMMA QUOTATION MARK  quoteleft        $D4          212
’      SINGLE COMMA QUOTATION MARK         quoteright       $D5          213
÷      DIVISION SIGN                       divide           $D6          214
◊      LOZENGE                             lozenge          $D7          215
ÿ      LATIN SMALL LETTER Y DIAERESIS      ydieresis        $D8          216
Ÿ      LATIN CAPITAL LETTER Y DIAERESIS    Ydieresis        $D9          217
⁄      FRACTION SLASH                      fraction         $DA          218
¤      CURRENCY SIGN                       currency         $DB          219
‹      LEFT POINTING SINGLE GUILLEMET      guilsinglleft    $DC          220
›      RIGHT POINTING SINGLE GUILLEMET     guilsinglright   $DD          221
ﬁ      (no Unicode designation)            fi               $DE          222
ﬂ      (no Unicode designation)            fl               $DF          223
‡      DOUBLE DAGGER                       daggerdbl        $E0          224
·      MIDDLE DOT                          periodcentered   $E1          225
‚      LOW SINGLE COMMA QUOTATION MARK     quotesinglbase   $E2          226
„      LOW DOUBLE COMMA QUOTATION MARK     quotedblbase     $E3          227
‰      PER MILLE SIGN                      perthousand      $E4          228
Â      LATIN CAPITAL LETTER A CIRCUMFLEX   Acircumflex      $E5          229
Ê      LATIN CAPITAL LETTER E CIRCUMFLEX   Ecircumflex      $E6          230
Á      LATIN CAPITAL LETTER A ACUTE        Aacute           $E7          231
Ë      LATIN CAPITAL LETTER E DIAERESIS    Edieresis        $E8          232
È      LATIN CAPITAL LETTER E GRAVE        Egrave           $E9          233
Í      LATIN CAPITAL LETTER I ACUTE        Iacute           $EA          234
Î      LATIN CAPITAL LETTER I CIRCUMFLEX   Icircumflex      $EB          235
Ï      LATIN CAPITAL LETTER I DIAERESIS    Idieresis        $EC          236
Ì      LATIN CAPITAL LETTER I GRAVE        Igrave           $ED          237
Ó      LATIN CAPITAL LETTER O ACUTE        Oacute           $EE          238
Ô      LATIN CAPITAL LETTER O CIRCUMFLEX   Ocircumflex      $EF          239
       APPLE LOGO                          Apple            $F0          240
Ò      LATIN CAPITAL LETTER O GRAVE        Ograve           $F1          241
Ú      LATIN CAPITAL LETTER U ACUTE        Uacute           $F2          242
Û      LATIN CAPITAL LETTER U CIRCUMFLEX   Ucircumflex      $F3          243
Ù      LATIN CAPITAL LETTER U GRAVE        Ugrave           $F4          244
ı      LATIN SMALL LETTER DOTLESS I        dotlessi         $F5          245
ˆ      MODIFIER LETTER CIRCUMFLEX          circumflex       $F6          246
˜      SPACING TILDE                       tilde            $F7          247
¯      SPACING MACRON                      macron           $F8          248
˘      SPACING BREVE                       breve            $F9          249
˙      SPACING DOT ABOVE                   dotaccent        $FA          250
˚      SPACING RING ABOVE                  ring             $FB          251
¸      SPACING CEDILLA                     cedilla          $FC          252
˝      SPACING DOUBLE ACUTE                hungarumlaut     $FD          253
˛      SPACING OGONEK                      ogonek           $FE          254
ˇ      MODIFIER LETTER HACEK               caron            $FF          255
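
For readers who want to look up an entry from Table A-3 programmatically, the sketch below uses the `mac_roman` codec in Python's standard library. The helper `describe` is a hypothetical name introduced for this example. Two assumptions apply: the names it prints are current Unicode names rather than the Unicode 1.0 names in the table, and the codec reflects a later revision of the encoding, so a few codes (such as $DB) may not match Table A-3.

```python
import unicodedata


def describe(code):
    """Decode one Standard Roman code and report its Unicode code point and name."""
    char = bytes([code]).decode("mac_roman")
    # Private-use characters (such as the Apple logo) have no standard name.
    name = unicodedata.name(char, "(no standard Unicode name)")
    return f"${code:02X} ({code}) -> U+{ord(char):04X} {name}"


for code in (0x41, 0x80, 0xA9, 0xD9, 0xF0):
    print(describe(code))
```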

Variations in the Character Set

Two types of variations from the Standard Roman character set can occur. First, several languages and regional variations of Roman reassign parts of the character set; second, many specialized Roman fonts completely override the character set to provide other types of symbols.

Table A-4 shows the glyph assignments in the Croatian version of the Roman character set that diverge from the Standard Roman character set, their Unicode and PostScript names, and their Macintosh character codes in hexadecimal and decimal. For example, the code (hexadecimal $A9) that is assigned to the copyright sign in the Standard Roman character set is replaced by the Scaron (that is, the Roman capital letter "S" with a hacek). The copyright sign appears later at position $D9, which is assigned to the Latin capital letter "Y" diaeresis in the Standard Roman character set.
Table A-4 Croatian variations from the Standard Roman character set

Glyph  Unicode name                        PostScript name  Hexadecimal  Decimal
Š      LATIN CAPITAL LETTER S HACEK        Scaron           $A9          169
Ž      LATIN CAPITAL LETTER Z HACEK        Zcaron           $AE          174
∆      INCREMENT                           Delta            $B4          180
š      LATIN SMALL LETTER S HACEK          scaron           $B9          185
ž      LATIN SMALL LETTER Z HACEK          zcaron           $BE          190
Ć      LATIN CAPITAL LETTER C ACUTE        Cacute           $C6          198
Č      LATIN CAPITAL LETTER C HACEK        Ccaron           $C8          200
Đ      LATIN CAPITAL LETTER D BAR          Dmacron          $D0          208
       APPLE LOGO                          apple            $D8          216
©      COPYRIGHT SIGN                      copyright        $D9          217
Æ      LATIN CAPITAL LETTER A E            AE               $DE          222
»      RIGHT POINTING GUILLEMET            guillemotright   $DF          223
–      EN DASH                             endash           $E0          224
ć      LATIN SMALL LETTER C ACUTE          cacute           $E6          230
č      LATIN SMALL LETTER C HACEK          ccaron           $E8          232
đ      LATIN SMALL LETTER D BAR            dmacron          $F0          240
π      GREEK SMALL LETTER PI               pi               $F9          249
Ë      LATIN CAPITAL LETTER E DIAERESIS    Edieresis        $FA          250
Ê      LATIN CAPITAL LETTER E CIRCUMFLEX   Ecircumflex      $FD          253
æ      LATIN SMALL LETTER A E              ae               $FE          254
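
One way to see the reassignment described above is to decode every code with two codecs and keep the codes that disagree. The sketch below does this with the `mac_roman` and `mac_croatian` codecs from Python's standard library; `differing_codes` is a hypothetical helper name. Because those codecs follow later revisions of both encodings, the full set of differences they report should not be expected to match Table A-4 line for line. The example prints only the two codes discussed in the text.

```python
def differing_codes(variant, base="mac_roman"):
    """Map each code whose character differs to a (base, variant) pair."""
    differences = {}
    for code in range(0x100):
        raw = bytes([code])
        # errors="replace" keeps the loop going if a codec leaves a code undefined.
        standard = raw.decode(base, errors="replace")
        regional = raw.decode(variant, errors="replace")
        if standard != regional:
            differences[code] = (standard, regional)
    return differences


croatian = differing_codes("mac_croatian")
for code in (0xA9, 0xD9):
    standard, regional = croatian[code]
    print(f"${code:02X}: Standard Roman {standard!r}, Croatian {regional!r}")
```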

Table A-5 shows the glyph assignments in the Romanian version of the Roman character set that diverge from the Standard Roman character set, their Unicode and PostScript names, and their Macintosh character codes in hexadecimal and decimal.
Table A-5 Romanian variations from the Standard Roman character set

Glyph  Unicode name                                    PostScript name  Hexadecimal  Decimal
Ă      LATIN CAPITAL LETTER A BREVE                    Abreve           $AE          174
Ș      LATIN CAPITAL LETTER S CEDILLA (COMMA VARIANT)  Scedilla         $AF          175
ă      LATIN SMALL LETTER A BREVE                      abreve           $BE          190
ș      LATIN SMALL LETTER S CEDILLA (COMMA VARIANT)    scedilla         $BF          191
Ț      LATIN CAPITAL LETTER T CEDILLA (COMMA VARIANT)  Tcedilla         $DE          222
ț      LATIN SMALL LETTER T CEDILLA (COMMA VARIANT)    tcedilla         $DF          223

Table A-6 shows the glyph assignments in the Turkish version of the Roman character set that diverge from the Standard Roman character set, their Unicode and PostScript names, and their Macintosh character codes in hexadecimal and decimal.
Table A-6 Turkish variations from the Standard Roman character set

Glyph  Unicode name                        PostScript name  Hexadecimal  Decimal
Ğ      LATIN CAPITAL LETTER G BREVE        Gbreve           $DA          218
ğ      LATIN SMALL LETTER G BREVE          gbreve           $DB          219
İ      LATIN CAPITAL LETTER I DOT          Idot             $DC          220
ı      LATIN SMALL LETTER DOTLESS I        dotlessi         $DD          221
Ş      LATIN CAPITAL LETTER S CEDILLA      Scedilla         $DE          222
ş      LATIN SMALL LETTER S CEDILLA        scedilla         $DF          223

Table A-7 shows the glyph assignments in the Icelandic and Faroese versions of the Roman character set that diverge from the Standard Roman character set, their Unicode and PostScript names, and their Macintosh character codes in hexadecimal and decimal.
Table A-7 Icelandic and Faroese variations from the Standard Roman character set

Glyph  Unicode name                        PostScript name  Hexadecimal  Decimal
Ý      LATIN CAPITAL LETTER Y ACUTE        Yacute           $A0          160
Ð      LATIN CAPITAL LETTER ETH            Eth              $DC          220
ð      LATIN SMALL LETTER ETH              eth              $DD          221
Þ      LATIN CAPITAL LETTER THORN          Thorn            $DE          222
þ      LATIN SMALL LETTER THORN            thorn            $DF          223
ý      LATIN SMALL LETTER Y ACUTE          yacute           $E0          224
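
Because each regional version changes only a handful of codes, it can be modeled as a small override table layered on the Standard Roman assignments. The sketch below takes that approach for the Turkish and the Icelandic and Faroese versions, with the overrides transcribed from Table A-6 and Table A-7. The names `TURKISH`, `ICELANDIC`, and `decode` are hypothetical, and Python's `mac_roman` codec stands in for the Standard Roman base. This is an illustration of the layering idea, not a description of how the script system itself is implemented.

```python
# Overrides transcribed from Table A-6 (Turkish) and Table A-7 (Icelandic and Faroese).
TURKISH = {0xDA: "Ğ", 0xDB: "ğ", 0xDC: "İ", 0xDD: "ı", 0xDE: "Ş", 0xDF: "ş"}
ICELANDIC = {0xA0: "Ý", 0xDC: "Ð", 0xDD: "ð", 0xDE: "Þ", 0xDF: "þ", 0xE0: "ý"}


def decode(data, overrides=None):
    """Decode bytes as Standard Roman, applying a regional override table first."""
    overrides = overrides or {}
    return "".join(
        overrides[byte] if byte in overrides else bytes([byte]).decode("mac_roman")
        for byte in data
    )


sample = b"\xde\x61"
print(decode(sample))             # Standard Roman reading
print(decode(sample, TURKISH))    # Turkish reading
print(decode(sample, ICELANDIC))  # Icelandic and Faroese reading
```

The same two bytes yield three different strings, which is why stored text needs to carry (or imply) the regional version it was written in.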

In addition to regional variations in the character set, the Roman script system in particular contains many fonts with unique glyphs. Because the character encoding is limited to 256 values, specialized fonts such as Symbol and ITC Zapf Dingbats override the Standard Roman character encoding. For example, in the Standard Roman character set $70 corresponds to lowercase "p", but the same code is the numeric symbol for pi (π) in the Symbol font, an outlined square in ITC Zapf Dingbats, and the musical symbol pianissimo ("play quietly") in the Sonata font. Hence, there is no guarantee that a Roman character code will always represent the same character in every font.


[7] System fonts use these codes for the printing characters PROPELLER, LOZENGE, RADICAL, and APPLE LOGO, respectively.

© Apple Computer, Inc.
6 JUL 1996