Guesses the language of a given string and returns the guess as a BCP 47 string.
SDKs
- iOS 3.0+
- macOS 10.5+
- Mac Catalyst 13.0+
- tvOS 9.0+
- watchOS 2.0+
Framework
- Core Foundation
Declaration
CFStringRef CFStringTokenizerCopyBestStringLanguage(CFStringRef string, CFRange range);
Parameters
string
The string to test to identify the language.
range
The range of string to use for the test. If NULL, the first few hundred characters of the string are examined.
Return Value
A language in BCP 47 form, or NULL if the language in string could not be identified. Ownership follows the Create Rule.
Discussion
The result is not guaranteed to be accurate. Typically, the function requires 200-400 characters to reliably guess the language of a string.
CFStringTokenizer recognizes the following languages:
ar (Arabic), bg (Bulgarian), cs (Czech), da (Danish), de (German), el (Greek), en (English), es (Spanish), fi (Finnish), fr (French), he (Hebrew), hr (Croatian), hu (Hungarian), is (Icelandic), it (Italian), ja (Japanese), ko (Korean), nb (Norwegian Bokmål), nl (Dutch), pl (Polish), pt (Portuguese), ro (Romanian), ru (Russian), sk (Slovak), sv (Swedish), th (Thai), tr (Turkish), uk (Ukrainian), zh-Hans (Simplified Chinese), zh-Hant (Traditional Chinese).