Language and Locale Designations

OS X and iOS support the International Organization for Standardization (ISO) standards for the identification of languages and regions. Specifically, they support the language and locale codes defined by the IETF's BCP 47 specification, which builds on those ISO codes. These codes are used in the naming of language-specific project directories and in other places where language and locale information is needed.

Using the available conventions, you can distinguish between different languages and between different regional dialects of a single language. The following sections show you how to specify this information in your code.

Language Designations

For language designations, you can use either the ISO 639-1 or ISO 639-2 conventions. The ISO 639-1 specification uses a two-letter code to identify a language and is the preferred way to identify languages. However, if an ISO 639-1 code is not available for a particular language, you may use the three-letter designators defined by the ISO 639-2 specification instead. Table 1 lists ISO designators for a subset of languages. Note that there is no ISO 639-1 designator for Hawaiian and so you must use the ISO 639-2 designator.

Table 1  Examples of ISO language designations

Language    ISO 639-1        ISO 639-2
English     en               eng
French      fr               fre
German      de               ger
Japanese    ja               jpn
Hawaiian    no designator    haw

For a complete list of ISO 639-1 and ISO 639-2 codes, go to http://www.loc.gov/standards/iso639-2/php/English_list.php.

Regional Designations

For regional designations, you can use the ISO 3166-1 conventions. This specification uses a two-letter, capitalized code to identify a specific country. By concatenating a language designator with an underscore character and a regional designator, you get a designator that identifies the locale for a specific language and country. Table 2 lists the regional designators for a subset of languages and countries.

Table 2  Examples of ISO regional designators

Regional dialect           Region designator
English (United States)    US
English (Great Britain)    GB
English (Australia)        AU
French (France)            FR
French (Canada)            CA

For a complete list of ISO 3166-1 codes, go to http://www.iso.ch.

Language and Locale IDs

A language ID designates a written language (or orthography) and can reflect either the generic language or a specific dialect of that language. To specify a language ID, you use a language designator by itself. To specify a specific dialect of a language, you use a hyphen to combine a language designator with a region designator. Thus, the English language as it is spoken in Great Britain would yield a language ID of en-GB, while the English language spoken in the United States would have a language ID of en-US. To specify the generic version of the English language, you would use the language ID en by itself.

A locale ID identifies a specific location where a given language is spoken. To specify a locale ID, use an underscore character to combine a language designator with a region designator. The locale ID for English-language speakers in Great Britain is en_GB, while the locale for English-speaking residents of the United States is en_US. Although locale IDs and language IDs might seem nearly identical, there is a subtle difference. A language ID identifies a written and spoken language only. A locale identifies a region and its conventions and has a more cultural context.

To illustrate the difference between language IDs and locale IDs, consider the following example. The dialect for a resident of Great Britain is specified by the code en-GB. The commonly used locale for that same person is en_GB. If you wanted to be very precise when specifying the locale, you could specify the locale code as en-GB_GB. This specifies a person who speaks the British dialect of English and who resides in Great Britain. If that same person moved to the United States, the appropriate locale would be en-GB_US, which would identify a person who speaks British English but uses the regional settings associated with the United States.
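The relationship between a locale ID and its two designators can be checked in code. The following is a minimal sketch that uses the NSLocale class method componentsFromLocaleIdentifier: to split a locale ID into its parts. The helper names GRLanguageFromLocaleID and GRRegionFromLocaleID are hypothetical and exist only for this illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): returns the language
// designator of a locale ID such as "en_GB".
static NSString *GRLanguageFromLocaleID(NSString *localeID) {
    NSDictionary *parts = [NSLocale componentsFromLocaleIdentifier:localeID];
    return parts[NSLocaleLanguageCode];
}

// Hypothetical helper: returns the region designator of a locale ID,
// or nil if the ID names a language only (for example "en").
static NSString *GRRegionFromLocaleID(NSString *localeID) {
    NSDictionary *parts = [NSLocale componentsFromLocaleIdentifier:localeID];
    return parts[NSLocaleCountryCode];
}
```

For the locale ID en_GB, the first helper yields en and the second yields GB. For the generic language ID en, the second helper returns nil because no region is present.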

In OS X v10.4 and later (and iOS), you can use the language ID tags defined in the BCP 47 specification. In addition to the ISO 3166-1 region codes, the draft of this standard (available at http://www.rfc-editor.org/) adds support for tags ranging in length from 3 to 8 characters. The use of these tags makes it possible to separate dialect or script information from a specific region or country.

Particularly for Chinese, a region code is not always the best way to specify the proper dialect or script. For example, traditional Chinese (Han) is the default written form in Taiwan and is identified by the code zh_TW in OS X v10.3.9 and earlier. However, traditional Chinese is also commonly used in Hong Kong and Macao, which means the zh_TW designator is not entirely accurate in those locations. The new standard defines new tags for the traditional Chinese (Hant) and simplified Chinese (Hans) scripts. Thus, traditional Chinese used in any country has the code zh-Hant. Traditional Chinese, as it is used in Taiwan, now has the locale code zh-Hant_TW.

Table 3 lists some of the other custom tags that identify a particular dialect or script.

Table 3  Custom language ID tags

Language ID    Description
az-Arab        Azerbaijani in the Arabic script.
az-Cyrl        Azerbaijani in the Cyrillic script.
az-Latn        Azerbaijani in the Latin script.
sr-Cyrl        Serbian in the Cyrillic script.
sr-Latn        Serbian in the Latin script.
uz-Cyrl        Uzbek in the Cyrillic script.
uz-Latn        Uzbek in the Latin script.
zh-Hans        Chinese in the simplified script.
zh-Hant        Chinese in the traditional script.

Language-Specific Project Directories

The more general you make your localized resources, the more regions you can support with a single set of resources. This can save a lot of space in your bundle and helps reduce translation costs. For example, if you do not need to distinguish between regional variants of English, you can include a single en.lproj directory to support users in the United States, Great Britain, and Australia. More importantly, you must use a single language directory on platforms such as iOS, which do not recognize dialect-specific resource files.

When searching for resources, the system bundle routines try to find the best match between the .lproj directories in your bundle and the user’s language and region preferences. In iOS, the bundle routines look for the requested resource in the generic language directory for the user’s preferred language, followed by any other generic language directories. In OS X, the bundle routines look for the requested resource in any region-specific directories first, followed by more generalized language directories. For example, if your Mac app had localizations for users in the United States, Great Britain, and Australia, the bundle routines would search the appropriate region directory (en_US.lproj, en_GB.lproj, or en_AU.lproj) first, followed by the en.lproj directory. The same application on the iPhone would look only in the en.lproj directory.
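To see which localizations the bundle routines would pick for a given set of preferences, you can ask NSBundle directly. The following is a minimal sketch built on the NSBundle class method preferredLocalizationsFromArray:forPreferences:. The wrapper name GRBestLocalizations is hypothetical, and the exact result depends on the platform and system version.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: given the localization names a bundle contains
// (.lproj directory names without the extension) and an ordered list of
// language preferences, returns the localizations the bundle routines
// would prefer. The returned strings are drawn only from `available`.
static NSArray *GRBestLocalizations(NSArray *available, NSArray *preferences) {
    return [NSBundle preferredLocalizationsFromArray:available
                                      forPreferences:preferences];
}
```

For example, calling GRBestLocalizations(@[@"en", @"en_GB", @"fr"], @[@"en-GB", @"fr"]) during development lets you check how a user who prefers British English would be matched against your bundle's directories before you ship.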

For more information about how the bundle routines locate resources, see “Accessing a Bundle's Contents” in Bundle Programming Guide.

Getting Language Names from Designators

Few users can recognize languages by their ISO designators. If you need to display the actual name of a language to a user, you can use the NSLocale method displayNameForKey:value: to get the correct display name for the language or locale ID.

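// Get the identifier of the user's current locale (for example, en_US).
NSString *identifier = [[NSLocale currentLocale] localeIdentifier];
// Ask the current locale for the user-readable name of that identifier.
NSString *displayName = [[NSLocale currentLocale] displayNameForKey:NSLocaleIdentifier value:identifier];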
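
The snippet above names the current locale in the user's own language. As a variation, the following sketch looks up the name of an arbitrary language designator and lets the caller choose the locale in which the name is written. The helper name GRLanguageName is hypothetical. It relies on the NSLocale method localeWithLocaleIdentifier: and the NSLocaleLanguageCode key.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the name of a language designator such as
// "fr" or "haw", written in the language of displayLocaleID ("en_US", say).
static NSString *GRLanguageName(NSString *designator, NSString *displayLocaleID) {
    NSLocale *displayLocale = [NSLocale localeWithLocaleIdentifier:displayLocaleID];
    return [displayLocale displayNameForKey:NSLocaleLanguageCode value:designator];
}
```

With a display locale of en_US, the designator fr produces French. In a real app you would normally pass the identifier of the current locale so that names appear in the user's language.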

Using Custom Designators

It is possible (albeit discouraged) to use a language or locale abbreviation that is not known to the NSBundle class or Core Foundation bundle functions. For example, you could create your own language designations for a language that is not yet listed in the ISO conventions.

If you choose to create a new designator, be sure to follow the rules found in sections 2.2.1 and 4.5 of BCP 47. Tags that do not follow these conventions are not guaranteed to work. When using custom tags, you must ensure that the abbreviation stored by the user’s language preferences matches the designator used by your .lproj directory exactly.
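
The exact-match requirement can be verified at run time. The following is a minimal sketch, assuming you compare the designator against the values of [NSLocale preferredLanguages] and [[NSBundle mainBundle] localizations]. The helper name GRCustomDesignatorIsUsable and the designator x-example used below are made up for illustration only.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns YES only if the custom designator appears,
// with identical spelling and case, both in the language preferences and
// among the bundle's localization names.
static BOOL GRCustomDesignatorIsUsable(NSString *designator,
                                       NSArray *languagePreferences,
                                       NSArray *localizationNames) {
    return [languagePreferences containsObject:designator] &&
           [localizationNames containsObject:designator];
}
```

A call such as GRCustomDesignatorIsUsable(@"x-example", [NSLocale preferredLanguages], [[NSBundle mainBundle] localizations]) returns NO if the preference is stored with different capitalization than the .lproj directory name, which is the mismatch the paragraph above warns about.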

Legacy Language Designators

In addition to the ISO language designators, the bundle routines also recognize several legacy language designators. These designators let you specify a language by a user-readable name instead of by a two- or three-character code. They include names such as English, French, German, Japanese, Chinese, Spanish, Italian, Swedish, and Portuguese, among others. Although these names are still recognized and processed by the NSBundle class and Core Foundation bundle functions, their use is deprecated and support for them in future versions of OS X or iOS is not guaranteed. Use the codes described in “Language Designations” and “Regional Designations” instead.
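
When migrating a project away from legacy names, it can help to ask the system for the canonical form of a designator. The following is a minimal sketch using the NSLocale class method canonicalLanguageIdentifierFromString:. The helper name GRCanonicalLanguageID is hypothetical, and the assumption that this method converts a legacy name such as English into an ISO code should be verified on the system versions you target.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: asks NSLocale for the canonical language identifier
// corresponding to a designator. Assumption: legacy names are converted to
// ISO codes by the system; confirm this on your target OS version.
static NSString *GRCanonicalLanguageID(NSString *designator) {
    return [NSLocale canonicalLanguageIdentifierFromString:designator];
}
```

You could run this helper over the names of the .lproj directories in an older project to find candidates for renaming.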