Type Property

language

Supplies the language for a token, if one can be determined.

Declaration

static let language: NSLinguisticTagScheme

Discussion

Each value for this tag scheme is a BCP 47 language identifier. For example, the language identifier for English is "en" and the identifier for Chinese written using the Simplified Chinese script is "zh-Hans". The identifier "und" (undetermined) is used if a specific language cannot be determined.
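The following is a minimal sketch of reading a language tag for a token and treating "und" as "no language found". The sample text and the resolvedLanguage helper are assumptions made for illustration; they are not part of the framework.

```swift
import Foundation

// Hypothetical helper (not part of Foundation): maps the "und"
// placeholder, or a missing tag, to nil so callers can use optional binding.
func resolvedLanguage(_ identifier: String?) -> String? {
    guard let identifier = identifier, identifier != "und" else { return nil }
    return identifier
}

// Sample text chosen for illustration.
let text = "The quick brown fox jumps over the lazy dog."

// Create a tagger that is configured for the language scheme only.
let tagger = NSLinguisticTagger(tagSchemes: [.language], options: 0)
tagger.string = text

// Ask for the language tag of the word at the start of the string.
let tag = tagger.tag(at: 0, unit: .word, scheme: .language, tokenRange: nil)

if let language = resolvedLanguage(tag?.rawValue) {
    print("Detected language: \(language)")
} else {
    print("Language could not be determined")
}
```

Wrapping the "und" check in a helper keeps the placeholder out of downstream code, which can then rely on an optional instead of comparing strings.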

The tagger generally attempts to determine the language of text at the level of an entire sentence, paragraph, or document, rather than word by word.
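Because detection works on larger spans of text, it can be more useful to request language tags per sentence, or for the string as a whole, than per word. The sketch below assumes a two-sentence sample string and collects one tag per sentence using the sentence unit; the variable names are illustrative, and the identifiers actually returned depend on the text and the platform.

```swift
import Foundation

// Sample text chosen for illustration; the two sentences are in different languages.
let text = "This is a sentence written in English. Ceci est une phrase écrite en français."

let tagger = NSLinguisticTagger(tagSchemes: [.language], options: 0)
tagger.string = text

// Collect one language identifier per sentence.
var sentenceLanguages: [(sentence: String, language: String)] = []
let fullRange = NSRange(location: 0, length: text.utf16.count)

tagger.enumerateTags(in: fullRange, unit: .sentence, scheme: .language, options: []) { tag, range, _ in
    guard let sentenceRange = Range(range, in: text) else { return }
    // Fall back to "und" when the tagger supplies no tag for this sentence.
    let language = tag?.rawValue ?? "und"
    sentenceLanguages.append((sentence: String(text[sentenceRange]), language: language))
}

for entry in sentenceLanguages {
    print("\(entry.language): \(entry.sentence)")
}

// The dominantLanguage property reports a single identifier for the whole string.
print("Dominant language: \(tagger.dominantLanguage ?? "und")")
```

When only a single answer for the whole string is needed, dominantLanguage avoids the enumeration entirely.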

See Also

Schemes

static let tokenType: NSLinguisticTagScheme

Classifies tokens according to their broad type: word, punctuation, or whitespace.

static let lexicalClass: NSLinguisticTagScheme

Classifies tokens according to class: part of speech, type of punctuation, or whitespace.

static let nameType: NSLinguisticTagScheme

Classifies tokens according to whether they are part of a named entity.

static let nameTypeOrLexicalClass: NSLinguisticTagScheme

Classifies tokens corresponding to names according to nameType, and classifies all other tokens according to lexicalClass.

static let lemma: NSLinguisticTagScheme

Supplies a stem form of a word token, if known.

static let script: NSLinguisticTagScheme

Supplies the script for a token, if one can be determined.