NSCharacterSet Class Reference
| Inherits from | |
| Conforms to | |
| Framework | /System/Library/Frameworks/Foundation.framework |
| Availability | Available in iOS 2.0 and later. |
| Companion guide | |
| Declared in | NSCharacterSet.h |
Overview
An NSCharacterSet object represents a set of Unicode-compliant characters. NSString and NSScanner objects use NSCharacterSet objects to group characters together for searching operations, so that they can find any of a particular set of characters during a search. The cluster’s two public classes, NSCharacterSet and NSMutableCharacterSet, declare the programmatic interface for static and dynamic character sets, respectively.
The objects you create using these classes are referred to as character set objects (and when no confusion will result, merely as character sets). Because of the nature of class clusters, character set objects aren’t actual instances of the NSCharacterSet or NSMutableCharacterSet classes but of one of their private subclasses. Although a character set object’s class is private, its interface is public, as declared by these abstract superclasses, NSCharacterSet and NSMutableCharacterSet. The character set classes adopt the NSCopying and NSMutableCopying protocols, making it convenient to convert a character set of one type to the other.
The NSCharacterSet class declares the programmatic interface for an object that manages a set of Unicode characters (see the NSString class cluster specification for information on Unicode). NSCharacterSet’s principal primitive method, characterIsMember:, provides the basis for all other instance methods in its interface. A subclass of NSCharacterSet needs only to implement this method, plus mutableCopyWithZone:, for proper behavior. For optimal performance, a subclass should also override bitmapRepresentation, which otherwise works by invoking characterIsMember: for every possible Unicode value.
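The fallback behavior described above, deriving a bitmap by asking the membership primitive about every value, can be sketched in plain C. This is only an illustration under stated assumptions: the function pointer type MemberTest is a hypothetical stand-in for characterIsMember:, is_ascii_digit is an invented example predicate, and the loop covers only the 16-bit values that fit the 8192-byte raw bitmap layout this reference describes under characterSetWithBitmapRepresentation:. It is not Foundation's actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the characterIsMember: primitive. */
typedef bool (*MemberTest)(unsigned int character);

/* Example predicate (an assumption for this sketch): ASCII digits only. */
bool is_ascii_digit(unsigned int character) {
    return character >= 0x30u && character <= 0x39u; /* '0'..'9' */
}

/* Build an 8192-byte raw bitmap by querying the predicate once for
   every 16-bit value. This is the slow path that a dedicated
   bitmapRepresentation override would avoid. */
void bitmap_from_predicate(unsigned char *bitmapRep, MemberTest isMember) {
    assert(bitmapRep != NULL && isMember != NULL);
    memset(bitmapRep, 0, 8192);
    for (unsigned int n = 0; n <= 0xFFFF; n++) {
        if (isMember(n)) {
            bitmapRep[n >> 3] |= (unsigned char)(1u << (n & 7));
        }
    }
}
```

The loop makes 65,536 predicate calls regardless of how small the set is, which suggests why overriding bitmapRepresentation is recommended for performance.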
NSCharacterSet is “toll-free bridged” with its Core Foundation counterpart, CFCharacterSetRef. See “Toll-Free Bridging” for more information on toll-free bridging.
The mutable subclass of NSCharacterSet is NSMutableCharacterSet.
Adopted Protocols
Tasks
Creating a Standard Character Set
- + alphanumericCharacterSet
- + capitalizedLetterCharacterSet
- + controlCharacterSet
- + decimalDigitCharacterSet
- + decomposableCharacterSet
- + illegalCharacterSet
- + letterCharacterSet
- + lowercaseLetterCharacterSet
- + newlineCharacterSet
- + nonBaseCharacterSet
- + punctuationCharacterSet
- + symbolCharacterSet
- + uppercaseLetterCharacterSet
- + whitespaceAndNewlineCharacterSet
- + whitespaceCharacterSet
Creating a Custom Character Set
Creating and Managing Character Sets as Bitmap Representations
Testing Set Membership
Class Methods
alphanumericCharacterSet
Returns a character set containing the characters in the categories Letters, Marks, and Numbers.
Return Value
A character set containing the characters in the categories Letters, Marks, and Numbers.
Discussion
Informally, this set is the set of all characters used as basic units of alphabets, syllabaries, ideographs, and digits.
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
capitalizedLetterCharacterSet
Returns a character set containing the characters in the category of Titlecase Letters.
Return Value
A character set containing the characters in the category of Titlecase Letters.
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
characterSetWithBitmapRepresentation:
Returns a character set containing characters determined by a given bitmap representation.
Parameters
- data
A bitmap representation of a character set.
Return Value
A character set containing characters determined by data.
Discussion
This method is useful for creating a character set object with data from a file or other external data source.
A raw bitmap representation of a character set is a byte array of 2^16 bits (that is, 8192 bytes). The value of the bit at position n represents the presence in the character set of the character with decimal Unicode value n. To add a character with decimal Unicode value n to a raw bitmap representation, use a statement such as the following:
unsigned char bitmapRep[8192];
bitmapRep[n >> 3] |= (((unsigned int)1) << (n & 7));
To remove that character:
bitmapRep[n >> 3] &= ~(((unsigned int)1) << (n & 7));
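The two statements above can be wrapped in small helpers. The following is a minimal sketch in plain C; the names BITMAP_BYTES, bitmap_clear_all, bitmap_add, and bitmap_remove are hypothetical and not part of Foundation. It assumes the buffer is zeroed before use, since an uninitialized local array holds indeterminate bits.

```c
#include <assert.h>
#include <string.h>

#define BITMAP_BYTES 8192 /* 2^16 bits, one bit per 16-bit character value */

/* Start from an empty set: every bit cleared. */
void bitmap_clear_all(unsigned char *bitmapRep) {
    memset(bitmapRep, 0, BITMAP_BYTES);
}

/* Mark the character with value n (0..65535) as present.
   n >> 3 selects the byte; n & 7 selects the bit within that byte. */
void bitmap_add(unsigned char *bitmapRep, unsigned int n) {
    assert(n <= 0xFFFF);
    bitmapRep[n >> 3] |= (unsigned char)(1u << (n & 7));
}

/* Mark the character with value n (0..65535) as absent. */
void bitmap_remove(unsigned char *bitmapRep, unsigned int n) {
    assert(n <= 0xFFFF);
    bitmapRep[n >> 3] &= (unsigned char)~(1u << (n & 7));
}
```

A buffer prepared this way could then be wrapped in an NSData object and passed to characterSetWithBitmapRepresentation:.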
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
characterSetWithCharactersInString:
Returns a character set containing the characters in a given string.
Parameters
- aString
A string containing characters for the new character set.
Return Value
A character set containing the characters in aString. Returns an empty character set if aString is empty.
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
characterSetWithContentsOfFile:
Returns a character set read from the bitmap representation stored in the file at a given path.
Parameters
- path
A path to a file containing a bitmap representation of a character set. The path name must end with the extension .bitmap.
Return Value
A character set read from the bitmap representation stored in the file at path.
Discussion
To read a bitmap representation from any file, use the NSData method dataWithContentsOfFile:options:error: and pass the result to characterSetWithBitmapRepresentation:.
This method doesn’t use filenames to check for the uniqueness of the character sets it creates. To prevent duplication of character sets in memory, cache them and make them available through an API that checks whether the requested set has already been loaded.
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
characterSetWithRange:
Returns a character set containing characters with Unicode values in a given range.
Parameters
- aRange
A range of Unicode values.
aRange.location is the value of the first character to return; aRange.location + aRange.length - 1 is the value of the last.
Return Value
A character set containing characters whose Unicode values are given by aRange. If aRange.length is 0, returns an empty character set.
Discussion
This code excerpt creates a character set object containing the lowercase English alphabetic characters:
NSRange lcEnglishRange;
NSCharacterSet *lcEnglishLetters;
lcEnglishRange.location = (unsigned int)'a';
lcEnglishRange.length = 26;
lcEnglishLetters = [NSCharacterSet characterSetWithRange:lcEnglishRange];
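The range arithmetic (first value at location, last value at location + length - 1) can be checked with a small plain C sketch. SketchRange is a hypothetical stand-in for NSRange so that the example compiles without Foundation, and bitmap_from_range only illustrates which values such a range covers in the raw bitmap layout; it is not how characterSetWithRange: is implemented.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for NSRange, so the sketch compiles as plain C. */
typedef struct {
    unsigned int location;
    unsigned int length;
} SketchRange;

/* Value of the last character covered by a non-empty range. */
unsigned int range_last(SketchRange r) {
    assert(r.length > 0);
    return r.location + r.length - 1;
}

/* Fill an 8192-byte raw bitmap with exactly the characters in r.
   A zero-length range leaves the bitmap empty. */
void bitmap_from_range(unsigned char *bitmapRep, SketchRange r) {
    memset(bitmapRep, 0, 8192);
    for (unsigned int i = 0; i < r.length; i++) {
        unsigned int n = r.location + i;
        assert(n <= 0xFFFF);
        bitmapRep[n >> 3] |= (unsigned char)(1u << (n & 7));
    }
}
```

With location set to 'a' and length set to 26, range_last yields the value of 'z', matching the lowercase English alphabet built in the excerpt above.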
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
controlCharacterSet
Returns a character set containing the characters in the categories of Control or Format Characters.
Return Value
A character set containing the characters in the categories of Control or Format Characters.
Discussion
These characters are specifically the Unicode values U+0000 to U+001F and U+007F to U+009F.
Availability
- Available in iOS 2.0 and later.
See Also
Declared In
NSCharacterSet.h
decimalDigitCharacterSet
Returns a character set containing the characters in the category of Decimal Numbers.
Return Value
A character set containing the characters in the category of Decimal Numbers.
Discussion
Informally, this set is the set of all characters used to represent the decimal values 0 through 9. These characters include, for example, the decimal digits of the Indic scripts and Arabic.
Availability
- Available in iOS 2.0 and later.
See Also
Declared In
NSCharacterSet.h
decomposableCharacterSet
Returns a character set containing all individual Unicode characters that can also be represented as composed character sequences.
Return Value
A character set containing all individual Unicode characters that can also be represented as composed character sequences (such as for letters with accents), by the definition of “standard decomposition” in version 3.2 of the Unicode character encoding standard.
Discussion
These characters include compatibility characters as well as pre-composed characters.
Availability
- Available in iOS 2.0 and later.
See Also
Declared In
NSCharacterSet.h
illegalCharacterSet
Returns a character set containing values in the category of Non-Characters or that have not yet been defined in version 3.2 of the Unicode standard.
Return Value
A character set containing values in the category of Non-Characters or that have not yet been defined in version 3.2 of the Unicode standard.
Availability
- Available in iOS 2.0 and later.
See Also
Declared In
NSCharacterSet.h
letterCharacterSet
Returns a character set containing the characters in the categories Letters and Marks.
Return Value
A character set containing the characters in the categories Letters and Marks.
Discussion
Informally, this set is the set of all characters used as letters of alphabets and ideographs.
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
lowercaseLetterCharacterSet
Returns a character set containing the characters in the category of Lowercase Letters.
Return Value
A character set containing the characters in the category of Lowercase Letters.
Discussion
Informally, this set is the set of all characters used as lowercase letters in alphabets that make case distinctions.
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
newlineCharacterSet
Returns a character set containing the newline characters.
Return Value
A character set containing the newline characters (U+000A–U+000D, U+0085).
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
nonBaseCharacterSet
Returns a character set containing the characters in the category of Marks.
Return Value
A character set containing the characters in the category of Marks.
Discussion
This set is also defined as all legal Unicode characters with a non-spacing priority greater than 0. Informally, this set is the set of all characters used as modifiers of base characters.
Availability
- Available in iOS 2.0 and later.
See Also
Declared In
NSCharacterSet.h
punctuationCharacterSet
Returns a character set containing the characters in the category of Punctuation.
Return Value
A character set containing the characters in the category of Punctuation.
Discussion
Informally, this set is the set of all non-whitespace characters used to separate linguistic units in scripts, such as periods, dashes, parentheses, and so on.
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
symbolCharacterSet
Returns a character set containing the characters in the category of Symbols.
Return Value
A character set containing the characters in the category of Symbols.
Discussion
These characters include, for example, the dollar sign ($) and the plus (+) sign.
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
uppercaseLetterCharacterSet
Returns a character set containing the characters in the categories of Uppercase Letters and Titlecase Letters.
Return Value
A character set containing the characters in the categories of Uppercase Letters and Titlecase Letters.
Discussion
Informally, this set is the set of all characters used as uppercase letters in alphabets that make case distinctions.
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
whitespaceAndNewlineCharacterSet
Returns a character set containing only the whitespace characters space (U+0020) and tab (U+0009) and the newline and nextline characters (U+000A–U+000D, U+0085).
Return Value
A character set containing only the whitespace characters space (U+0020) and tab (U+0009) and the newline and nextline characters (U+000A–U+000D, U+0085).
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
whitespaceCharacterSet
Returns a character set containing only the in-line whitespace characters space (U+0020) and tab (U+0009).
Return Value
A character set containing only the in-line whitespace characters space (U+0020) and tab (U+0009).
Discussion
This set doesn’t contain the newline or carriage return characters.
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
Instance Methods
bitmapRepresentation
Returns an NSData object encoding the receiver in binary format.
Return Value
An NSData object encoding the receiver in binary format.
Discussion
This format is suitable for saving to a file or otherwise transmitting or archiving.
A raw bitmap representation of a character set is a byte array of 2^16 bits (that is, 8192 bytes). The value of the bit at position n represents the presence in the character set of the character with decimal Unicode value n. To test for the presence of a character with decimal Unicode value n in a raw bitmap representation, use an expression such as the following:
unsigned char bitmapRep[8192];
if (bitmapRep[n >> 3] & (((unsigned int)1) << (n & 7))) {
    /* Character is present. */
}
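The membership test above can be packaged as a function. This is a minimal plain C sketch; bitmap_contains is a hypothetical helper name, and the buffer is assumed to hold the 8192 bytes of a raw bitmap, for example copied out of the NSData object that bitmapRepresentation returns.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if the character with value n (0..65535) is present
   in an 8192-byte raw bitmap. */
bool bitmap_contains(const unsigned char *bitmapRep, unsigned int n) {
    assert(n <= 0xFFFF);
    return (bitmapRep[n >> 3] & (1u << (n & 7))) != 0;
}
```

Comparing the masked value against zero, rather than returning it directly, keeps the result a strict true or false whichever bit position is tested.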
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
characterIsMember:
Returns a Boolean value that indicates whether a given character is in the receiver.
Parameters
- aCharacter
The character to test for membership in the receiver.
Return Value
YES if aCharacter is in the receiving character set, otherwise NO.
Availability
- Available in iOS 2.0 and later.
See Also
Declared In
NSCharacterSet.h
hasMemberInPlane:
Returns a Boolean value that indicates whether the receiver has at least one member in a given character plane.
Parameters
- thePlane
A character plane.
Return Value
YES if the receiver has at least one member in thePlane, otherwise NO.
Discussion
This method makes it easier to find the plane containing the members of the current character set. The Basic Multilingual Plane is plane 0.
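As background for the plane argument, each Unicode plane holds 65,536 code points, so the plane number is the code point shifted right by 16 bits. The plain C sketch below shows that arithmetic; plane_of is a hypothetical helper and is not part of NSCharacterSet.

```c
#include <assert.h>
#include <stdint.h>

/* Plane number (0..16) of a Unicode code point.
   Plane 0 is the Basic Multilingual Plane (U+0000 to U+FFFF). */
uint8_t plane_of(uint32_t codePoint) {
    assert(codePoint <= 0x10FFFF);
    return (uint8_t)(codePoint >> 16);
}
```

For example, U+0041 falls in plane 0, while U+1F600 falls in plane 1, so a set containing only U+1F600 would be expected to report a member in plane 1 and none in plane 0.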
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
invertedSet
Returns a character set containing only characters that don’t exist in the receiver.
Return Value
A character set containing only characters that don’t exist in the receiver.
Discussion
Inverting an immutable character set is much more efficient than inverting a mutable character set.
Availability
- Available in iOS 2.0 and later.
See Also
- invert (NSMutableCharacterSet)
Declared In
NSCharacterSet.h
isSupersetOfSet:
Returns a Boolean value that indicates whether the receiver is a superset of another given character set.
Parameters
- theOtherSet
A character set.
Availability
- Available in iOS 2.0 and later.
Declared In
NSCharacterSet.h
longCharacterIsMember:
Returns a Boolean value that indicates whether a given long character is a member of the receiver.
Parameters
- theLongChar
A UTF-32 character.
Discussion
This method supports the specification of 32-bit characters.
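Characters outside the Basic Multilingual Plane occupy two 16-bit units (a surrogate pair) in UTF-16, so they need to be combined into a single 32-bit value before they can be tested as one character. The plain C sketch below shows the standard UTF-16 decoding arithmetic; utf32_from_surrogates is a hypothetical helper name and is not part of Foundation.

```c
#include <assert.h>
#include <stdint.h>

/* Combine a UTF-16 surrogate pair into one UTF-32 code point.
   high must be in 0xD800..0xDBFF and low in 0xDC00..0xDFFF. */
uint32_t utf32_from_surrogates(uint16_t high, uint16_t low) {
    assert(high >= 0xD800 && high <= 0xDBFF);
    assert(low >= 0xDC00 && low <= 0xDFFF);
    return 0x10000u
         + (((uint32_t)high - 0xD800u) << 10)
         + ((uint32_t)low - 0xDC00u);
}
```

The resulting value is the kind of 32-bit character that longCharacterIsMember: accepts as theLongChar.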
Availability
- Available in iOS 2.0 and later.
See Also
Declared In
NSCharacterSet.h
Constants
NSOpenStepUnicodeReservedBase
Specifies the lower bound for a Unicode character range reserved for Apple’s corporate use.
enum {
    NSOpenStepUnicodeReservedBase = 0xF400
};
Constants
NSOpenStepUnicodeReservedBase
Specifies the lower bound for a Unicode character range reserved for Apple’s corporate use (the range is 0xF400–0xF8FF).
Available in iOS 2.0 and later.
Declared in NSCharacterSet.h.
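A range check against the reserved block can be sketched in plain C as follows. The constants SKETCH_RESERVED_BASE and SKETCH_RESERVED_LAST and the helper is_in_reserved_range are hypothetical names for this illustration; they mirror the 0xF400–0xF8FF range stated above so that the example compiles without Foundation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constants mirroring the documented range. */
enum {
    SKETCH_RESERVED_BASE = 0xF400, /* same value as NSOpenStepUnicodeReservedBase */
    SKETCH_RESERVED_LAST = 0xF8FF  /* upper bound of the range given above */
};

/* Return true if the 16-bit character value lies in the reserved range. */
bool is_in_reserved_range(unsigned int character) {
    assert(character <= 0xFFFF);
    return character >= SKETCH_RESERVED_BASE
        && character <= SKETCH_RESERVED_LAST;
}
```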
Declared In
NSCharacterSet.h
© 2008 Apple Inc. All Rights Reserved. (Last updated: 2008-10-15)