NSLinguisticTagger Class Reference
| Inherits from | NSObject |
| Conforms to | NSObject (NSObject) |
| Framework | /System/Library/Frameworks/Foundation.framework |
| Availability | Available in iOS 5.0 and later. |
| Declared in | NSLinguisticTagger.h |
Overview
The NSLinguisticTagger class is used to automatically segment natural-language text and tag it with information, such as parts of speech. It can also tag languages, scripts, stem forms of words, etc. An instance of this class is assigned a string to tag, and clients can then obtain tags and ranges for tokens in that string appropriate to a given tag scheme.
Thread Safety
A given instance of NSLinguisticTagger should not be used from more than one thread simultaneously.
Tasks
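Creating a Linguistic Tagger
– initWithTagSchemes:options:
Getting the Tag Schemes
+ availableTagSchemesForLanguage:
– tagSchemes
Getting and Setting the Analyzed String
– setString:
– string
– stringEditedInRange:changeInLength:
Getting and Setting Orthography
– orthographyAtIndex:effectiveRange:
– setOrthography:range:
Enumerating Linguistic Tags
– enumerateTagsInRange:scheme:options:usingBlock:
– possibleTagsAtIndex:scheme:tokenRange:sentenceRange:scores:
– tagAtIndex:scheme:tokenRange:sentenceRange:
– tagsInRange:scheme:options:tokenRanges:
Determining a Sentence for a Range
– sentenceRangeForRange: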
Class Methods
availableTagSchemesForLanguage:
Returns the tag schemes supported by the linguistic tagger for a particular language.
Parameters
- language
A standard language abbreviation, as used with NSOrthography.
Return Value
An array of “Linguistic Tag Schemes.”
Discussion
Clients wishing to know the tag schemes supported by an NSLinguisticTagger instance for a particular language may query them with this method. The language should be specified using a standard abbreviation, as with NSOrthography.
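The query described above can be sketched as follows. This is a minimal illustration that assumes ARC and a modern Objective-C compiler; SchemeIsAvailable is a hypothetical helper written for this example, not part of Foundation.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: reports whether `scheme` is listed among the tag
// schemes available for `language` (a standard abbreviation such as "en").
static BOOL SchemeIsAvailable(NSString *scheme, NSString *language) {
    NSArray *schemes = [NSLinguisticTagger availableTagSchemesForLanguage:language];
    return [schemes containsObject:scheme];
}

int main(void) {
    @autoreleasepool {
        // Print every scheme reported for English.
        NSArray *schemes = [NSLinguisticTagger availableTagSchemesForLanguage:@"en"];
        NSLog(@"Schemes for en: %@", schemes);

        // Check for one scheme before relying on it.
        if (SchemeIsAvailable(NSLinguisticTagSchemeLexicalClass, @"en")) {
            NSLog(@"Lexical class tagging is available for English.");
        }
    }
    return 0;
}
```

Checking availability first lets a client fall back to a simpler scheme, such as NSLinguisticTagSchemeTokenType, when a richer one is not offered for the language.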
Availability
- Available in iOS 5.0 and later.
Declared In
NSLinguisticTagger.h
Instance Methods
enumerateTagsInRange:scheme:options:usingBlock:
Enumerates the specified range of the string, providing the Block with the located tags.
Parameters
- range
The range to analyze.
- tagScheme
The tag scheme.
- opts
The linguistic tagger options to use. See “NSLinguisticTaggerOptions” for the constants. These constants can be combined using the C bitwise OR operator.
- block
The Block to apply to ranges of the string.
The Block takes four arguments:
- tag
The located linguistic tag.
- tokenRange
The range of the token to which the tag applies.
- sentenceRange
The range of the sentence in which the tag occurs.
- stop
A reference to a Boolean value. The block can set the value to YES to stop further processing of the string. The stop argument is an out-only argument. You should only ever set this Boolean to YES within the Block.
Discussion
The tagger will segment the string as needed into sentences and tokens, and return those ranges along with a tag for any scheme in its array of tag schemes.
This is the fundamental tagging method of NSLinguisticTagger. This method’s block iterates over all tokens intersecting a given range, supplying tags and ranges. There are several additional convenience methods, for obtaining a sentence range, information about a single token, or information about all tokens intersecting a given range at once.
For example, if the tag scheme is NSLinguisticTagSchemeLexicalClass, the tags will specify the part of speech (for word tokens) or the type of whitespace or punctuation (for whitespace or punctuation tokens). If the tag scheme is NSLinguisticTagSchemeLemma, the tags will specify the stem form of the word (if known) for each word token.
Note that this method returns the ranges of all tokens that intersect the given range, so a reported token range can extend beyond the range that was passed in.
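The following sketch shows one way to use this method with NSLinguisticTagSchemeLexicalClass. It assumes ARC; TaggedWords is a hypothetical helper written for this example, and the tags produced for any particular sentence depend on the system's language data.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the word tokens of `text` as "token/tag"
// strings, skipping whitespace and punctuation tokens.
static NSArray *TaggedWords(NSString *text) {
    NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc]
        initWithTagSchemes:@[NSLinguisticTagSchemeLexicalClass] options:0];
    [tagger setString:text];

    NSMutableArray *result = [NSMutableArray array];
    NSLinguisticTaggerOptions opts =
        NSLinguisticTaggerOmitWhitespace | NSLinguisticTaggerOmitPunctuation;

    [tagger enumerateTagsInRange:NSMakeRange(0, [text length])
                          scheme:NSLinguisticTagSchemeLexicalClass
                         options:opts
                      usingBlock:^(NSString *tag, NSRange tokenRange,
                                   NSRange sentenceRange, BOOL *stop) {
        // tokenRange locates the token in the analyzed string.
        NSString *token = [text substringWithRange:tokenRange];
        [result addObject:[NSString stringWithFormat:@"%@/%@", token, tag]];
    }];
    return result;
}

int main(void) {
    @autoreleasepool {
        for (NSString *entry in TaggedWords(@"The quick brown fox jumps.")) {
            NSLog(@"%@", entry);
        }
    }
    return 0;
}
```

To stop early, for example after the first verb, the block would set *stop = YES.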
Availability
- Available in iOS 5.0 and later.
Declared In
NSLinguisticTagger.h
initWithTagSchemes:options:
Creates a linguistic tagger instance using the specified tag schemes and options.
Parameters
- tagSchemes
An array of tag schemes. See “Linguistic Tag Schemes” for the possible values.
- opts
The linguistic tagger options to use. See “NSLinguisticTaggerOptions” for the constants. These constants can be combined using the C bitwise OR operator.
Return Value
An initialized linguistic tagger.
Availability
- Available in iOS 5.0 and later.
Declared In
NSLinguisticTagger.h
orthographyAtIndex:effectiveRange:
Returns the orthography at the index and also returns the effective range.
Parameters
- charIndex
The character index to begin examination.
- effectiveRange
An NSRangePointer that, upon completion, contains the range of the orthography containing charIndex.
Return Value
The orthography for the location.
Availability
- Available in iOS 5.0 and later.
Declared In
NSLinguisticTagger.h
possibleTagsAtIndex:scheme:tokenRange:sentenceRange:scores:
Returns an array of possible tags for the given scheme at the specified index, supplying matching scores.
Parameters
- charIndex
The initial character index.
- tagScheme
The tag scheme. See “Linguistic Tag Schemes” for the possible values.
- tokenRange
The token range.
- sentenceRange
The range of the sentence.
- scores
Returns by-reference an array of numeric scores (wrapped as NSValue objects) indicating the likelihood that the range matches the tag scheme.
Return Value
Returns an array of possible tags for the tagScheme at the specified location, starting with the most likely tag. For some tag schemes only a single tag will be returned, but for others a list of possibilities will be provided.
Availability
- Available in iOS 5.0 and later.
Declared In
NSLinguisticTagger.h
sentenceRangeForRange:
Returns the range of the sentence containing the specified range.
Parameters
- charRange
The range.
Return Value
Returns the range of a sentence that contains charRange.
Discussion
This method can be used to obtain the enclosing sentence range given a token range.
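As a sketch of that use, the example below finds a word in a string and asks the tagger for the enclosing sentence. It assumes ARC; SentenceContaining is a hypothetical helper, and the word is located with a plain substring search rather than by tokenizing.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the sentence of `text` that contains the
// first occurrence of `word`, or nil if `word` does not occur.
static NSString *SentenceContaining(NSString *text, NSString *word) {
    NSRange wordRange = [text rangeOfString:word];
    if (wordRange.location == NSNotFound) {
        return nil;
    }

    NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc]
        initWithTagSchemes:@[NSLinguisticTagSchemeTokenType] options:0];
    [tagger setString:text];

    // Expand the word's range to the range of its enclosing sentence.
    NSRange sentenceRange = [tagger sentenceRangeForRange:wordRange];
    return [text substringWithRange:sentenceRange];
}

int main(void) {
    @autoreleasepool {
        NSString *text = @"The cat sat. The dog ran.";
        NSLog(@"%@", SentenceContaining(text, @"dog"));
    }
    return 0;
}
```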
Availability
- Available in iOS 5.0 and later.
Declared In
NSLinguisticTagger.h
setOrthography:range:
Sets the orthography for the specified range.
Parameters
- orthography
The orthography.
- charRange
The range.
Discussion
If the orthography of the linguistic tagger is not set, it will determine it automatically from the contents of the text. Clients should call this method only if they already know the language of the text by some other means.
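A client that already knows its text is English written in Latin script might set the orthography as in the sketch below. It assumes ARC; DominantLanguageAfterSetting is a hypothetical helper, and the script and language codes are sample values chosen for the example.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: declares the whole of `text` to be in `language`
// written in `script`, then reads the orthography back at index 0.
static NSString *DominantLanguageAfterSetting(NSString *text,
                                              NSString *script,
                                              NSString *language) {
    NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc]
        initWithTagSchemes:@[NSLinguisticTagSchemeLanguage] options:0];
    [tagger setString:text];

    // The language map associates each script with its languages.
    NSOrthography *orthography =
        [NSOrthography orthographyWithDominantScript:script
                                         languageMap:@{script: @[language]}];
    [tagger setOrthography:orthography range:NSMakeRange(0, [text length])];

    NSRange effectiveRange;
    NSOrthography *found = [tagger orthographyAtIndex:0
                                       effectiveRange:&effectiveRange];
    return [found dominantLanguage];
}

int main(void) {
    @autoreleasepool {
        NSString *text = @"The weather is nice today.";
        NSLog(@"%@", DominantLanguageAfterSetting(text, @"Latn", @"en"));
    }
    return 0;
}
```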
Availability
- Available in iOS 5.0 and later.
Declared In
NSLinguisticTagger.h
setString:
Sets the string to be analyzed by the linguistic tagger.
Parameters
- string
The string.
Availability
- Available in iOS 5.0 and later.
Declared In
NSLinguisticTagger.h
string
Returns the string being analyzed by the linguistic tagger.
Return Value
The string.
Availability
- Available in iOS 5.0 and later.
Declared In
NSLinguisticTagger.h
stringEditedInRange:changeInLength:
Notifies the linguistic tagger that the string (if mutable) has changed as specified by the parameters.
Parameters
- newCharRange
The range in the final string that was edited.
- delta
The change in length.
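The sketch below illustrates how the two parameters relate when text is appended to a mutable string: the edited range is where the new text sits in the final string, and the change in length is the number of characters added. It assumes ARC; CountWords and WordCountAfterAppending are hypothetical helpers.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: counts the word tokens in the tagger's string.
static NSUInteger CountWords(NSLinguisticTagger *tagger) {
    __block NSUInteger count = 0;
    NSLinguisticTaggerOptions opts =
        NSLinguisticTaggerOmitWhitespace | NSLinguisticTaggerOmitPunctuation;
    [tagger enumerateTagsInRange:NSMakeRange(0, [[tagger string] length])
                          scheme:NSLinguisticTagSchemeTokenType
                         options:opts
                      usingBlock:^(NSString *tag, NSRange tokenRange,
                                   NSRange sentenceRange, BOOL *stop) {
        count++;
    }];
    return count;
}

// Hypothetical helper: appends `addition` to a mutable copy of `initial`
// that the tagger is analyzing, notifies the tagger, and recounts words.
static NSUInteger WordCountAfterAppending(NSString *initial, NSString *addition) {
    NSMutableString *text = [initial mutableCopy];
    NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc]
        initWithTagSchemes:@[NSLinguisticTagSchemeTokenType] options:0];
    [tagger setString:text];

    NSUInteger oldLength = [text length];
    [text appendString:addition];

    // The edited range is expressed in the final string; delta is positive
    // because the string grew.
    [tagger stringEditedInRange:NSMakeRange(oldLength, [addition length])
                 changeInLength:(NSInteger)[addition length]];
    return CountWords(tagger);
}

int main(void) {
    @autoreleasepool {
        NSLog(@"%lu", (unsigned long)WordCountAfterAppending(@"The cat sat.",
                                                             @" The dog ran."));
    }
    return 0;
}
```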
Availability
- Available in iOS 5.0 and later.
Declared In
NSLinguisticTagger.h
tagAtIndex:scheme:tokenRange:sentenceRange:
Returns a tag for a single scheme at the specified index.
Parameters
- charIndex
The initial character index.
- tagScheme
The tag scheme. See “Linguistic Tag Schemes” for the possible values.
- tokenRange
A pointer to the token range. If NULL, no token range is returned.
- sentenceRange
A pointer to the range of the sentence. If NULL, no sentence range is returned.
Return Value
Returns the tag for the requested tagScheme. There are cases in which there may not be a tag for a given scheme and token, in which case the return value of the method would be nil.
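A short sketch of a single-token lookup follows. It assumes ARC; TokenAtIndex is a hypothetical helper that passes NULL for the sentence range because only the token range is needed.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the token of `text` that contains the
// character at `index`, or nil if no tag is reported there.
static NSString *TokenAtIndex(NSString *text, NSUInteger index) {
    NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc]
        initWithTagSchemes:@[NSLinguisticTagSchemeTokenType] options:0];
    [tagger setString:text];

    NSRange tokenRange = NSMakeRange(NSNotFound, 0);
    NSString *tag = [tagger tagAtIndex:index
                                scheme:NSLinguisticTagSchemeTokenType
                            tokenRange:&tokenRange
                         sentenceRange:NULL];
    if (tag == nil || tokenRange.location == NSNotFound) {
        return nil;
    }
    return [text substringWithRange:tokenRange];
}

int main(void) {
    @autoreleasepool {
        // Index 9 falls inside the second word of the string.
        NSLog(@"%@", TokenAtIndex(@"Hello, world!", 9));
    }
    return 0;
}
```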
Discussion
This method returns a single tag rather than an array, so a token that has no tag for the requested tagScheme is reported by a nil return value.
Availability
- Available in iOS 5.0 and later.
Declared In
NSLinguisticTagger.h
tagSchemes
Returns the tag schemes with which the linguistic tagger was initialized.
Return Value
An array of tag schemes. See “Linguistic Tag Schemes” for the possible values.
Availability
- Available in iOS 5.0 and later.
Declared In
NSLinguisticTagger.h
tagsInRange:scheme:options:tokenRanges:
Returns an array of linguistic tags and token ranges.
Parameters
- range
The range from which to return tags.
- tagScheme
The tag scheme. See “Linguistic Tag Schemes” for the possible values.
- opts
The linguistic tagger options to use. See “NSLinguisticTaggerOptions” for the constants. These constants can be combined using the C bitwise OR operator.
- tokenRanges
Returns by-reference an array of token ranges wrapped in NSValue objects.
Return Value
An array of the tags corresponding to the entries in the tokenRanges array.
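The sketch below walks the two parallel arrays, pairing each tag with its token range. It assumes ARC; TokenTypeSummary is a hypothetical helper written for this example.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns "token=tag" strings for every
// non-whitespace token of `text`, using the token type scheme.
static NSArray *TokenTypeSummary(NSString *text) {
    NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc]
        initWithTagSchemes:@[NSLinguisticTagSchemeTokenType] options:0];
    [tagger setString:text];

    NSArray *tokenRanges = nil;
    NSArray *tags = [tagger tagsInRange:NSMakeRange(0, [text length])
                                 scheme:NSLinguisticTagSchemeTokenType
                                options:NSLinguisticTaggerOmitWhitespace
                            tokenRanges:&tokenRanges];

    NSMutableArray *summary = [NSMutableArray array];
    for (NSUInteger i = 0; i < [tags count]; i++) {
        // Each range is wrapped in an NSValue; unwrap it with rangeValue.
        NSRange range = [[tokenRanges objectAtIndex:i] rangeValue];
        NSString *token = [text substringWithRange:range];
        [summary addObject:[NSString stringWithFormat:@"%@=%@",
                            token, [tags objectAtIndex:i]]];
    }
    return summary;
}

int main(void) {
    @autoreleasepool {
        NSLog(@"%@", TokenTypeSummary(@"Hello, world!"));
    }
    return 0;
}
```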
Availability
- Available in iOS 5.0 and later.
Declared In
NSLinguisticTagger.h
Constants
NSLinguisticTaggerOptions
These constants specify the linguistic tagger options. They can be combined using the C bitwise OR operator.
enum {
NSLinguisticTaggerOmitWords = 1 << 0,
NSLinguisticTaggerOmitPunctuation = 1 << 1,
NSLinguisticTaggerOmitWhitespace = 1 << 2,
NSLinguisticTaggerOmitOther = 1 << 3,
NSLinguisticTaggerJoinNames = 1 << 4
};
typedef NSUInteger NSLinguisticTaggerOptions;
Constants
NSLinguisticTaggerOmitWords
Omit tokens of type NSLinguisticTagWord (items considered to be words).
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTaggerOmitPunctuation
Omit tokens of type NSLinguisticTagPunctuation (all punctuation).
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTaggerOmitWhitespace
Omit tokens of type NSLinguisticTagWhitespace (whitespace of all sorts).
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTaggerOmitOther
Omit tokens of type NSLinguisticTagOther (non-linguistic items such as symbols).
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTaggerJoinNames
Typically, multiple-word names will be returned as multiple tokens, following the standard tokenization practice of the tagger. If this option is set, then multiple-word names will be joined together and returned as a single token.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
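As an illustration of combining these options, the sketch below builds a mask that omits punctuation and whitespace and joins multiple-word names, then uses it for an enumeration. It assumes ARC; ExampleOptions is a hypothetical helper, the sample sentence is invented, and whether a given name is recognized depends on the system's language data.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: the option mask used throughout this example.
static NSLinguisticTaggerOptions ExampleOptions(void) {
    return NSLinguisticTaggerOmitPunctuation
         | NSLinguisticTaggerOmitWhitespace
         | NSLinguisticTaggerJoinNames;
}

int main(void) {
    @autoreleasepool {
        NSLinguisticTaggerOptions opts = ExampleOptions();

        // Individual flags can be tested with the bitwise AND operator.
        if (opts & NSLinguisticTaggerJoinNames) {
            NSLog(@"Multiple-word names will be joined.");
        }

        NSString *text = @"Ada Lovelace moved to New York.";
        NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc]
            initWithTagSchemes:@[NSLinguisticTagSchemeNameTypeOrLexicalClass]
                       options:opts];
        [tagger setString:text];

        [tagger enumerateTagsInRange:NSMakeRange(0, [text length])
                              scheme:NSLinguisticTagSchemeNameTypeOrLexicalClass
                             options:opts
                          usingBlock:^(NSString *tag, NSRange tokenRange,
                                       NSRange sentenceRange, BOOL *stop) {
            NSLog(@"%@ -> %@", [text substringWithRange:tokenRange], tag);
        }];
    }
    return 0;
}
```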
Linguistic Tag Schemes
These constants specify the linguistic tag schemes used by initWithTagSchemes:options: to create the linguistic tagger instance. The method tagSchemes returns an array of the schemes the instance was created with.
NSString *const NSLinguisticTagSchemeTokenType;
NSString *const NSLinguisticTagSchemeLexicalClass;
NSString *const NSLinguisticTagSchemeNameType;
NSString *const NSLinguisticTagSchemeNameTypeOrLexicalClass;
NSString *const NSLinguisticTagSchemeLemma;
NSString *const NSLinguisticTagSchemeLanguage;
NSString *const NSLinguisticTagSchemeScript;
Constants
NSLinguisticTagSchemeTokenType
This tag scheme classifies tokens according to their broad type: word, punctuation, whitespace, etc. The possible tags are NSLinguisticTagWord, NSLinguisticTagPunctuation, NSLinguisticTagWhitespace, or NSLinguisticTagOther. For this scheme a client may use pointer equality to compare the values with the tag constants.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagSchemeLexicalClass
This tag scheme classifies tokens according to class: part of speech for words, type of punctuation or whitespace, etc. The value will be one of the constants specified in “NSLinguisticTagSchemeLexicalClass.” For this scheme a client may use pointer equality to compare the values with the tag constants.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagSchemeNameType
This tag scheme classifies tokens as to whether they are part of named entities of various types or not. The possible tags are NSLinguisticTagPersonalName, NSLinguisticTagPlaceName, or NSLinguisticTagOrganizationName. For this scheme a client may use pointer equality to compare the values with the tag constants.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagSchemeNameTypeOrLexicalClass
This tag scheme follows NSLinguisticTagSchemeNameType for names and NSLinguisticTagSchemeLexicalClass for all other tokens. The possible tags are those specified in “NSLinguisticTagSchemeLexicalClass” or “NSLinguisticTagSchemeNameType.” For this scheme a client may use pointer equality to compare the values with the tag constants.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagSchemeLemma
This tag scheme supplies the stem form of each word, if known.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagSchemeLanguage
This tag scheme tags tokens according to their language. The tag values will be standard language abbreviations such as “en”, “fr”, “de”, etc., as used with the NSOrthography class. Note that the tagger generally attempts to determine the language of text at the level of an entire sentence or paragraph, rather than word by word.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagSchemeScript
This tag scheme tags tokens according to their script. The tag values will be standard script abbreviations such as “Latn”, “Cyrl”, “Jpan”, “Hans”, “Hant”, etc.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagSchemeTokenTypes
These constants specify the broad type of a token.
NSString *const NSLinguisticTagWord;
NSString *const NSLinguisticTagPunctuation;
NSString *const NSLinguisticTagWhitespace;
NSString *const NSLinguisticTagOther;
Constants
NSLinguisticTagWord
The token indicates a word.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagPunctuation
The token indicates punctuation.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagWhitespace
The token indicates whitespace of any sort.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagOther
The token indicates a token other than those currently defined.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagSchemeLexicalClass
These constants specify the lexical class of a token.
NSString *const NSLinguisticTagNoun;
NSString *const NSLinguisticTagVerb;
NSString *const NSLinguisticTagAdjective;
NSString *const NSLinguisticTagAdverb;
NSString *const NSLinguisticTagPronoun;
NSString *const NSLinguisticTagDeterminer;
NSString *const NSLinguisticTagParticle;
NSString *const NSLinguisticTagPreposition;
NSString *const NSLinguisticTagNumber;
NSString *const NSLinguisticTagConjunction;
NSString *const NSLinguisticTagInterjection;
NSString *const NSLinguisticTagClassifier;
NSString *const NSLinguisticTagIdiom;
NSString *const NSLinguisticTagOtherWord;
NSString *const NSLinguisticTagSentenceTerminator;
NSString *const NSLinguisticTagOpenQuote;
NSString *const NSLinguisticTagCloseQuote;
NSString *const NSLinguisticTagOpenParenthesis;
NSString *const NSLinguisticTagCloseParenthesis;
NSString *const NSLinguisticTagWordJoiner;
NSString *const NSLinguisticTagDash;
NSString *const NSLinguisticTagOtherPunctuation;
NSString *const NSLinguisticTagParagraphBreak;
NSString *const NSLinguisticTagOtherWhitespace;
Constants
NSLinguisticTagNoun
This token is a noun.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagVerb
This token is a verb.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagAdjective
This token is an adjective.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagAdverb
This token is an adverb.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagPronoun
This token is a pronoun.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagDeterminer
This token is a determiner.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagParticle
This token is a particle.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagPreposition
This token is a preposition.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagNumber
This token is a number.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagConjunction
This token is a conjunction.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagInterjection
This token is an interjection.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagClassifier
This token is a classifier.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagIdiom
This token is an idiom.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagOtherWord
This token is some other word.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagSentenceTerminator
This token is a sentence terminator.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagOpenQuote
This token is an open quote.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagCloseQuote
This token is a close quote.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagOpenParenthesis
This token is an open parenthesis.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagCloseParenthesis
This token is a close parenthesis.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagWordJoiner
This token is a word joiner.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagDash
This token is a dash.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagOtherPunctuation
This token is punctuation not recognized as another token type.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagParagraphBreak
This token is a paragraph break.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagOtherWhitespace
This token is whitespace not recognized as another whitespace type.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagSchemeNameType
These constants define linguistic tags for specific types of words: people, places, and organizations.
NSString *const NSLinguisticTagPersonalName;
NSString *const NSLinguisticTagPlaceName;
NSString *const NSLinguisticTagOrganizationName;
Constants
NSLinguisticTagPersonalName
This token is a personal name.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagPlaceName
This token is a place name.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
NSLinguisticTagOrganizationName
This token is an organization name.
Available in iOS 5.0 and later.
Declared in NSLinguisticTagger.h.
© 2011 Apple Inc. All Rights Reserved. (Last updated: 2011-10-12)