Foundation Framework Reference NSCharacterSet Class Reference

NSCharacterSet

An NSCharacterSet object represents a set of Unicode-compliant characters. NSString and NSScanner objects use NSCharacterSet objects to group characters together for searching operations, so that they can find any of a particular set of characters during a search. The cluster’s two public classes, NSCharacterSet and NSMutableCharacterSet, declare the programmatic interface for static and dynamic character sets, respectively.

The objects you create using these classes are referred to as character set objects (and when no confusion will result, merely as character sets). Because of the nature of class clusters, character set objects aren’t actual instances of the NSCharacterSet or NSMutableCharacterSet classes but of one of their private subclasses. Although a character set object’s class is private, its interface is public, as declared by these abstract superclasses, NSCharacterSet and NSMutableCharacterSet. The character set classes adopt the NSCopying and NSMutableCopying protocols, making it convenient to convert a character set of one type to the other.

The NSCharacterSet class declares the programmatic interface for an object that manages a set of Unicode characters (see the NSString class cluster specification for information on Unicode). NSCharacterSet’s principal primitive method, characterIsMember:, provides the basis for all other instance methods in its interface. A subclass of NSCharacterSet needs only to implement this method, plus mutableCopyWithZone:, for proper behavior. For optimal performance, a subclass should also override bitmapRepresentation, which otherwise works by invoking characterIsMember: for every possible Unicode value.

NSCharacterSet is “toll-free bridged” with its Core Foundation counterpart, CFCharacterSetRef. See Toll-Free Bridging for more information on toll-free bridging.

The mutable subclass of NSCharacterSet is NSMutableCharacterSet.

  • Returns a character set containing the characters in Unicode General Categories L*, M*, and N*.

    Declaration

    Swift

    class func alphanumericCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)alphanumericCharacterSet

    Return Value

    A character set containing all the alphanumeric characters.

    Discussion

    Informally, this set is the set of all characters used as basic units of alphabets, syllabaries, ideographs, and digits.

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set containing the characters in Unicode General Category Lt.

    Declaration

    Swift

    class func capitalizedLetterCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)capitalizedLetterCharacterSet

    Return Value

    A character set containing all the capitalized letter characters.

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set containing the characters in Unicode General Categories Cc and Cf.

    Declaration

    Swift

    class func controlCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)controlCharacterSet

    Return Value

    A character set containing all the control characters.

    Discussion

    These characters include, for example, the soft hyphen (U+00AD), control characters to support bi-directional text, and IETF language tag characters.

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set containing the characters in the category of Decimal Numbers.

    Declaration

    Swift

    class func decimalDigitCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)decimalDigitCharacterSet

    Return Value

    A character set containing all the decimal digit characters.

    Discussion

    Informally, this set is the set of all characters used to represent the decimal values 0 through 9. These characters include, for example, the decimal digits of the Indic scripts and Arabic.

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set containing individual Unicode characters that can also be represented as composed character sequences (such as for letters with accents), by the definition of “standard decomposition” in version 3.2 of the Unicode character encoding standard.

    Declaration

    Swift

    class func decomposableCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)decomposableCharacterSet

    Return Value

    A character set containing all the decomposable characters.

    Discussion

    These characters include compatibility characters as well as pre-composed characters.

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set containing values in the category of Non-Characters or that have not yet been defined in version 3.2 of the Unicode standard.

    Declaration

    Swift

    class func illegalCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)illegalCharacterSet

    Return Value

    A character set containing all the illegal characters.

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set containing the characters in Unicode General Categories L* and M*.

    Declaration

    Swift

    class func letterCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)letterCharacterSet

    Return Value

    A character set containing all the letter characters.

    Discussion

    Informally, this set is the set of all characters used as letters of alphabets and ideographs.

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set containing the characters in Unicode General Category Ll.

    Declaration

    Swift

    class func lowercaseLetterCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)lowercaseLetterCharacterSet

    Return Value

    A character set containing all the lowercase letter characters.

    Discussion

    Informally, this set is the set of all characters used as lowercase letters in alphabets that make case distinctions.

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set containing the newline characters (U+000A through U+000D, U+0085, U+2028, and U+2029).

    Declaration

    Swift

    class func newlineCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)newlineCharacterSet

    Return Value

    A character set containing all the newline characters.

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set containing the characters in Unicode General Category M*.

    Declaration

    Swift

    class func nonBaseCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)nonBaseCharacterSet

    Return Value

    A character set containing all the non-base characters.

    Discussion

    This set is also defined as all legal Unicode characters with a non-spacing priority greater than 0. Informally, this set is the set of all characters used as modifiers of base characters.

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set containing the characters in Unicode General Category P*.

    Declaration

    Swift

    class func punctuationCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)punctuationCharacterSet

    Return Value

    A character set containing all the punctuation characters.

    Discussion

    Informally, this set is the set of all non-whitespace characters used to separate linguistic units in scripts, such as periods, dashes, parentheses, and so on.

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set containing the characters in Unicode General Category S*.

    Declaration

    Swift

    class func symbolCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)symbolCharacterSet

    Return Value

    A character set containing all the symbol characters.

    Discussion

    These characters include, for example, the dollar sign ($) and the plus (+) sign.

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set containing the characters in Unicode General Categories Lu and Lt.

    Declaration

    Swift

    class func uppercaseLetterCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)uppercaseLetterCharacterSet

    Return Value

    A character set containing all the uppercase letter characters.

    Discussion

    Informally, this set is the set of all characters used as uppercase letters in alphabets that make case distinctions.

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set containing characters in Unicode General Category Z*, U+000A through U+000D, and U+0085.

    Declaration

    Swift

    class func whitespaceAndNewlineCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)whitespaceAndNewlineCharacterSet

    Return Value

    A character set containing all the whitespace and newline characters.

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set containing the characters in Unicode General Category Zs and CHARACTER TABULATION (U+0009).

    Declaration

    Swift

    class func whitespaceCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)whitespaceCharacterSet

    Return Value

    A character set containing all the whitespace characters.

    Discussion

    This set doesn’t contain the newline or carriage return characters.

    Availability

    Available in iOS 2.0 and later.

  • Returns the character set for characters allowed in a fragment URL component.

    Declaration

    Swift

    class func URLFragmentAllowedCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)URLFragmentAllowedCharacterSet

    Discussion

    The fragment component of a URL is the component after a # symbol. For example, in the URL http://www.example.com/index.html#jumpLocation, the fragment is jumpLocation.

    Availability

    Available in iOS 7.0 and later.

  • Returns the character set for characters allowed in a host URL subcomponent.

    Declaration

    Swift

    class func URLHostAllowedCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)URLHostAllowedCharacterSet

    Discussion

    The host component of a URL is usually the component immediately after the first two leading slashes. If the URL contains a username and password, the host component is the component after the @ sign. For example, in the URL http://username:password@www.example.com/index.html, the host component is www.example.com.

    Availability

    Available in iOS 7.0 and later.

  • Returns the character set for characters allowed in a password URL subcomponent.

    Declaration

    Swift

    class func URLPasswordAllowedCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)URLPasswordAllowedCharacterSet

    Discussion

    The password component of a URL is the component immediately following the colon after the username component of the URL, and ends at the @ sign. For example, in the URL http://username:password@www.example.com/index.html, the password component is password.

    Availability

    Available in iOS 7.0 and later.

  • Returns the character set for characters allowed in a path URL component.

    Declaration

    Swift

    class func URLPathAllowedCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)URLPathAllowedCharacterSet

    Discussion

    The path component of a URL is the component immediately following the host component (if present). It ends wherever the query or fragment component begins. For example, in the URL http://www.example.com/index.php?key1=value1, the path component is /index.php.

    Availability

    Available in iOS 7.0 and later.

  • Returns the character set for characters allowed in a query URL component.

    Declaration

    Swift

    class func URLQueryAllowedCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)URLQueryAllowedCharacterSet

    Discussion

    The query component of a URL is the component immediately following a question mark (?). For example, in the URL http://www.example.com/index.php?key1=value1#jumpLink, the query component is key1=value1.

    Availability

    Available in iOS 7.0 and later.

  • Returns the character set for characters allowed in a user URL subcomponent.

    Declaration

    Swift

    class func URLUserAllowedCharacterSet() -> NSCharacterSet

    Objective-C

    + (NSCharacterSet *)URLUserAllowedCharacterSet

    Discussion

    The user component of a URL is an optional component that precedes the host component, and ends at either a colon (if a password is specified) or an @ sign (if no password is specified). For example, in the URL http://username:password@www.example.com/index.html, the user component is username.

    Availability

    Available in iOS 7.0 and later.

  • Returns a character set containing the characters in a given string.

    Declaration

    Swift

    init(charactersInString aString: String)

    Objective-C

    + (NSCharacterSet *)characterSetWithCharactersInString:(NSString *)aString

    Parameters

    aString

    A string containing characters for the new character set.

    Return Value

    A character set containing the characters in aString. Returns an empty character set if aString is empty.

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set containing characters with Unicode values in a given range.

    Declaration

    Swift

    init(range aRange: NSRange)

    Objective-C

    + (NSCharacterSet *)characterSetWithRange:(NSRange)aRange

    Parameters

    aRange

    A range of Unicode values.

    aRange.location is the value of the first character to return; aRange.location + aRange.length - 1 is the value of the last.

    Return Value

    A character set containing characters whose Unicode values are given by aRange. If aRange.length is 0, returns an empty character set.

    Discussion

    This code excerpt creates a character set object containing the lowercase English alphabetic characters:

    NSRange lcEnglishRange;
    NSCharacterSet *lcEnglishLetters;
    lcEnglishRange.location = (unsigned int)'a';
    lcEnglishRange.length = 26;
    lcEnglishLetters = [NSCharacterSet characterSetWithRange:lcEnglishRange];
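    The relationship between aRange and the resulting members can also be sketched in plain C, without Foundation. This is only an illustration: ExampleRange is a hypothetical stand-in for NSRange, the helper names are invented for this sketch, and the 8192-byte bitmap covers the Basic Multilingual Plane only. It is not a description of how NSCharacterSet is implemented.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for NSRange (location and length only). */
typedef struct {
    unsigned long location;
    unsigned long length;
} ExampleRange;

/* Marks every value from location through location + length - 1 in a
   BMP-only bitmap. A length of 0 leaves the bitmap empty. */
static void example_fill_range(unsigned char bitmap[8192], ExampleRange r)
{
    unsigned long n;
    assert(r.location + r.length <= 65536UL); /* BMP only in this sketch */
    memset(bitmap, 0, 8192);
    for (n = r.location; n < r.location + r.length; n++) {
        bitmap[n >> 3] |= (unsigned char)(1u << (n & 7));
    }
}

/* Returns 1 if value n is marked in the bitmap, otherwise 0. */
static int example_contains(const unsigned char bitmap[8192], unsigned long n)
{
    assert(n < 65536UL);
    return (bitmap[n >> 3] >> (n & 7)) & 1;
}
```

    With location set to 'a' and length set to 26, the first member is 'a' and the last is 'z'; the neighboring values on either side are not members.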

    Availability

    Available in iOS 2.0 and later.

  • A character set containing only characters that don’t exist in the receiver. (read-only)

    Declaration

    Swift

    @NSCopying var invertedSet: NSCharacterSet { get }

    Objective-C

    @property(readonly, copy) NSCharacterSet *invertedSet

    Discussion

    Using the inverse of an immutable character set is much more efficient than inverting a mutable character set.
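    Conceptually, inversion flips membership for every character. The sketch below shows that idea on a raw BMP-only bitmap, assuming the 8192-byte layout described under bitmapRepresentation. The function name is hypothetical, and this does not claim to reflect how invertedSet is implemented or why the immutable case is faster.

```c
#include <assert.h>
#include <stddef.h>

/* Writes the complement of a BMP-only bitmap into out: every bit that is
   set in the input is clear in the output, and vice versa. */
static void example_invert(const unsigned char in[8192], unsigned char out[8192])
{
    size_t i;
    assert(in != NULL && out != NULL);
    for (i = 0; i < 8192; i++) {
        out[i] = (unsigned char)~in[i];
    }
}
```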

    Availability

    Available in iOS 2.0 and later.

    See Also

    invert (NSMutableCharacterSet)

  • Returns a character set containing characters determined by a given bitmap representation.

    Declaration

    Swift

    init(bitmapRepresentation data: NSData)

    Objective-C

    + (NSCharacterSet *)characterSetWithBitmapRepresentation:(NSData *)data

    Parameters

    data

    A bitmap representation of a character set.

    Return Value

    A character set containing characters determined by data.

    Discussion

    This method is useful for creating a character set object with data from a file or other external data source.

    A raw bitmap representation of a character set is a byte array whose first 2^16 bits (that is, 8192 bytes) represent the code point range of the Basic Multilingual Plane (BMP), such that the value of the bit at position n indicates whether the character with decimal Unicode value n is in the character set. A bitmap representation may contain zero to sixteen additional 8192-byte segments, one for each additional Unicode plane containing a character in the character set, with each 8192-byte segment preceded by a single plane index byte.

    To add a character in the Basic Multilingual Plane (BMP) with decimal Unicode value n to a raw bitmap representation, you might do the following:

    unsigned char bitmapRep[8192] = {0}; /* start from an empty set */
    bitmapRep[n >> 3] |= (((unsigned int)1) << (n & 7));

    To remove that character:

    bitmapRep[n >> 3] &= ~(((unsigned int)1) << (n & 7));

    Availability

    Available in iOS 2.0 and later.

  • Returns a character set read from the bitmap representation stored in the file at a given path.

    Declaration

    Swift

    init?(contentsOfFile fName: String)

    Objective-C

    + (NSCharacterSet *)characterSetWithContentsOfFile:(NSString *)path

    Parameters

    path

    A path to a file containing a bitmap representation of a character set. The path name must end with the extension .bitmap.

    Return Value

    A character set read from the bitmap representation stored in the file at path.

    Discussion

    This method doesn’t use filenames to check for the uniqueness of the character sets it creates. To prevent duplication of character sets in memory, cache them and make them available through an API that checks whether the requested set has already been loaded.

    To read a bitmap representation from any file, use the NSData method dataWithContentsOfFile:options:error: and pass the result to characterSetWithBitmapRepresentation:.

    Availability

    Available in iOS 2.0 and later.

  • An NSData object encoding the receiver in binary format. (read-only)

    Declaration

    Swift

    @NSCopying var bitmapRepresentation: NSData { get }

    Objective-C

    @property(readonly, copy) NSData *bitmapRepresentation

    Discussion

    This format is suitable for saving to a file or otherwise transmitting or archiving.

    A raw bitmap representation of a character set is a byte array whose first 2^16 bits (that is, 8192 bytes) represent the code point range of the Basic Multilingual Plane (BMP), such that the value of the bit at position n indicates whether the character with decimal Unicode value n is in the character set. A bitmap representation may contain zero to sixteen additional 8192-byte segments, one for each additional Unicode plane containing a character in the character set, with each 8192-byte segment preceded by a single plane index byte.

    For example, a character set containing only Basic Latin (ASCII) characters, which are contained by the Basic Multilingual Plane (BMP, plane 0), has a bitmap representation with a size of 8192 bytes, whereas a character set containing both Basic Latin (ASCII) characters and emoji characters, which are contained by the Supplementary Multilingual Plane (SMP, plane 1), has a bitmap representation with a size of 16385 bytes (8192 bytes for BMP, followed by the byte 0x01 for the plane index of SMP, followed by 8192 bytes for SMP).

    To test for the presence of a character in the Basic Multilingual Plane (BMP) with decimal Unicode value n in a raw bitmap representation, you might do the following:

    unsigned char bitmapRep[8192];
    if (bitmapRep[n >> 3] & (((unsigned int)1) << (n & 7))) {
        /* Character is present. */
    }
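    For characters outside the BMP, the layout described above suggests scanning the additional segments for the matching plane index byte. The following is a minimal sketch under that reading of the format; the function and macro names are hypothetical, and it assumes each extra segment is exactly one plane index byte followed by 8192 bitmap bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_PLANE_BYTES 8192u

/* Returns 1 if code point cp is marked in a raw bitmap of len bytes,
   otherwise 0. Plane 0 is read from the first 8192 bytes; other planes are
   located by their plane index byte. */
static int example_bitmap_contains(const unsigned char *data, size_t len, uint32_t cp)
{
    uint32_t plane = cp >> 16;    /* 65,536 code points per plane */
    uint32_t low = cp & 0xFFFFu;  /* position within the plane */
    size_t offset = EXAMPLE_PLANE_BYTES;

    assert(data != NULL && len >= EXAMPLE_PLANE_BYTES && cp <= 0x10FFFFu);
    if (plane == 0) {
        return (data[low >> 3] >> (low & 7)) & 1;
    }
    /* Walk the additional segments: index byte, then 8192 bitmap bytes. */
    while (offset + 1 + EXAMPLE_PLANE_BYTES <= len) {
        if (data[offset] == plane) {
            const unsigned char *segment = data + offset + 1;
            return (segment[low >> 3] >> (low & 7)) & 1;
        }
        offset += 1 + EXAMPLE_PLANE_BYTES;
    }
    return 0; /* no segment for this plane, so no members in it */
}
```

    In the 16385-byte example above, a lookup for an emoji code point skips the first 8192 bytes, finds the index byte 0x01 at offset 8192, and tests the bit inside the segment that follows.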

    Availability

    Available in iOS 2.0 and later.

  • Returns a Boolean value that indicates whether a given character is in the receiver.

    Declaration

    Swift

    func characterIsMember(_ aCharacter: unichar) -> Bool

    Objective-C

    - (BOOL)characterIsMember:(unichar)aCharacter

    Parameters

    aCharacter

    The character to test for membership of the receiver.

    Return Value

    YES (true in Swift) if aCharacter is in the receiving character set; otherwise NO (false).

    Availability

    Available in iOS 2.0 and later.

  • Returns a Boolean value that indicates whether the receiver has at least one member in a given character plane.

    Declaration

    Swift

    func hasMemberInPlane(_ thePlane: UInt8) -> Bool

    Objective-C

    - (BOOL)hasMemberInPlane:(uint8_t)thePlane

    Parameters

    thePlane

    A character plane.

    Return Value

    YES (true in Swift) if the receiver has at least one member in thePlane; otherwise NO (false).

    Discussion

    This method makes it easier to find the plane containing the members of the current character set. The Basic Multilingual Plane (BMP) is plane 0.
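    Each Unicode plane spans 65,536 code points, so the plane number of a code point is its value shifted right by 16 bits. The helper below is a small sketch of that arithmetic (the function name is hypothetical); the result is the kind of value you would pass as thePlane.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the plane (0 through 16) that contains code point cp. */
static uint8_t example_plane_of(uint32_t cp)
{
    assert(cp <= 0x10FFFFu); /* highest valid Unicode code point */
    return (uint8_t)(cp >> 16);
}
```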

    Availability

    Available in iOS 2.0 and later.

  • Returns a Boolean value that indicates whether the receiver is a superset of another given character set.

    Declaration

    Swift

    func isSupersetOfSet(_ theOtherSet: NSCharacterSet) -> Bool

    Objective-C

    - (BOOL)isSupersetOfSet:(NSCharacterSet *)theOtherSet

    Parameters

    theOtherSet

    A character set.

    Return Value

    YES (true in Swift) if the receiver is a superset of theOtherSet; otherwise NO (false).
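    On raw bitmaps, the superset relation amounts to checking that every bit set in the other bitmap is also set in the receiver's. The sketch below illustrates this for BMP-only bitmaps in the 8192-byte layout described under bitmapRepresentation; the function name is hypothetical and the code is not a statement about how isSupersetOfSet: is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every bit set in other is also set in self (BMP only),
   otherwise 0. A set is always a superset of itself. */
static int example_is_superset(const unsigned char self[8192],
                               const unsigned char other[8192])
{
    size_t i;
    assert(self != NULL && other != NULL);
    for (i = 0; i < 8192; i++) {
        if ((self[i] & other[i]) != other[i]) {
            return 0;
        }
    }
    return 1;
}
```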

    Availability

    Available in iOS 2.0 and later.

  • Returns a Boolean value that indicates whether a given long character is a member of the receiver.

    Declaration

    Swift

    func longCharacterIsMember(_ theLongChar: UTF32Char) -> Bool

    Objective-C

    - (BOOL)longCharacterIsMember:(UTF32Char)theLongChar

    Parameters

    theLongChar

    A UTF32 character.

    Return Value

    YES (true in Swift) if theLongChar is in the receiver; otherwise NO (false).

    Discussion

    This method supports the specification of 32-bit characters.
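    A unichar holds a single 16-bit UTF-16 code unit, so a character outside the BMP arrives as a surrogate pair and has to be combined into one 32-bit value before it can be tested this way. The sketch below shows the standard UTF-16 combination; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Combines a UTF-16 high/low surrogate pair into a 32-bit code point. */
static uint32_t example_from_surrogates(uint16_t high, uint16_t low)
{
    assert(high >= 0xD800u && high <= 0xDBFFu); /* high (lead) surrogate */
    assert(low >= 0xDC00u && low <= 0xDFFFu);   /* low (trail) surrogate */
    return 0x10000u
         + (((uint32_t)high - 0xD800u) << 10)
         + ((uint32_t)low - 0xDC00u);
}
```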

    Availability

    Available in iOS 2.0 and later.

  • Specifies lower bound for a Unicode character range reserved for Apple’s corporate use.

    Declaration

    Swift

    var NSOpenStepUnicodeReservedBase: Int { get }

    Objective-C

    enum { NSOpenStepUnicodeReservedBase = 0xF400 };

    Constants

    • NSOpenStepUnicodeReservedBase

      Specifies lower bound for a Unicode character range reserved for Apple’s corporate use (the range is 0xF400–0xF8FF).

      Available in iOS 2.0 and later.
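      A check against the documented range 0xF400–0xF8FF might look like the following sketch. The EXAMPLE_ constants and the function name are hypothetical; only the two bounds come from the description above.

```c
#include <assert.h>

/* Bounds of the reserved range as documented above. */
enum { EXAMPLE_RESERVED_BASE = 0xF400, EXAMPLE_RESERVED_LAST = 0xF8FF };

/* Returns 1 if the 16-bit character value c falls in the reserved range. */
static int example_is_reserved(unsigned int c)
{
    assert(c <= 0xFFFFu); /* a single 16-bit character value */
    return c >= EXAMPLE_RESERVED_BASE && c <= EXAMPLE_RESERVED_LAST;
}
```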