Class

NSCharacterSet

An NSCharacterSet object represents a set of Unicode-compliant characters. NSString and NSScanner objects use NSCharacterSet objects to group characters together for searching operations, so that they can find any of a particular set of characters during a search. The cluster’s two public classes, NSCharacterSet and NSMutableCharacterSet, declare the programmatic interface for static and dynamic character sets, respectively.

Overview

The objects you create using these classes are referred to as character set objects (and when no confusion will result, merely as character sets). Because of the nature of class clusters, character set objects aren’t actual instances of the NSCharacterSet or NSMutableCharacterSet classes but of one of their private subclasses. Although a character set object’s class is private, its interface is public, as declared by these abstract superclasses, NSCharacterSet and NSMutableCharacterSet. The character set classes adopt the NSCopying and NSMutableCopying protocols, making it convenient to convert a character set of one type to the other.
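As a minimal sketch of converting between the two types through NSCopying and NSMutableCopying, the following assumes Foundation is available and uses illustrative variable names (immutableSet, mutableSet, snapshot) that do not come from this reference. The forced casts are a simplification for brevity.

```swift
import Foundation

// Start from an immutable set built from a literal string.
let immutableSet = NSCharacterSet(charactersIn: "abc")

// NSMutableCopying: obtain an editable copy and extend it.
let mutableSet = immutableSet.mutableCopy() as! NSMutableCharacterSet
mutableSet.addCharacters(in: "xyz")

// NSCopying: take a copy of the mutable set in its current state.
let snapshot = mutableSet.copy() as! NSCharacterSet

// UTF-16 code unit for "x", used to probe each set.
let x: unichar = 0x0078
let inImmutable = immutableSet.characterIsMember(x)
let inMutable = mutableSet.characterIsMember(x)
let inSnapshot = snapshot.characterIsMember(x)
```

The original set is unchanged by edits made to its mutable copy, so only mutableSet and snapshot should report "x" as a member.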

The NSCharacterSet class declares the programmatic interface for an object that manages a set of Unicode characters (see the NSString class cluster specification for information on Unicode). NSCharacterSet’s principal primitive method, characterIsMember(_:), provides the basis for all other instance methods in its interface. A subclass of NSCharacterSet needs only to implement this method, plus mutableCopy(with:), for proper behavior. For optimal performance, a subclass should also override bitmapRepresentation, which otherwise works by invoking characterIsMember(_:) for every possible Unicode value.

NSCharacterSet is “toll-free bridged” with its Core Foundation counterpart, CFCharacterSet. See Toll-Free Bridging for more information on toll-free bridging.

Symbols

Creating a Standard Character Set

class var alphanumerics: CharacterSet

Returns a character set containing the characters in Unicode General Categories L*, M*, and N*.

class var capitalizedLetters: CharacterSet

Returns a character set containing the characters in Unicode General Category Lt.

class var controlCharacters: CharacterSet

Returns a character set containing the characters in Unicode General Categories Cc and Cf.

class var decimalDigits: CharacterSet

Returns a character set containing the characters in the category of Decimal Numbers.

class var decomposables: CharacterSet

Returns a character set containing individual Unicode characters that can also be represented as composed character sequences (such as for letters with accents), by the definition of “standard decomposition” in version 3.2 of the Unicode character encoding standard.

class var illegalCharacters: CharacterSet

Returns a character set containing values in the category of Non-Characters or that have not yet been defined in version 3.2 of the Unicode standard.

class var letters: CharacterSet

Returns a character set containing the characters in Unicode General Categories L* and M*.

class var lowercaseLetters: CharacterSet

Returns a character set containing the characters in Unicode General Category Ll.

class var newlines: CharacterSet

Returns a character set containing the newline characters (U+000A through U+000D, U+0085, U+2028, and U+2029).

class var nonBaseCharacters: CharacterSet

Returns a character set containing the characters in Unicode General Category M*.

class var punctuationCharacters: CharacterSet

Returns a character set containing the characters in Unicode General Category P*.

class var symbols: CharacterSet

Returns a character set containing the characters in Unicode General Category S*.

class var uppercaseLetters: CharacterSet

Returns a character set containing the characters in Unicode General Categories Lu and Lt.

class var whitespacesAndNewlines: CharacterSet

Returns a character set containing the characters in Unicode General Category Z*, U+000A through U+000D, and U+0085.

class var whitespaces: CharacterSet

Returns a character set containing the characters in Unicode General Category Zs and CHARACTER TABULATION (U+0009).
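The standard sets above are typically passed to string searching and splitting methods. The sketch below shows one plausible use with String methods from Foundation; the sample text and variable names are made up for illustration.

```swift
import Foundation

let raw = "  Order 66, shipped\n"

// Strip leading and trailing whitespace and newline characters.
let trimmed = raw.trimmingCharacters(in: NSCharacterSet.whitespacesAndNewlines)

// Find the first decimal digit in the trimmed text, if any.
let firstDigit = trimmed.rangeOfCharacter(from: NSCharacterSet.decimalDigits)

// Split the text wherever a punctuation character occurs.
let pieces = trimmed.components(separatedBy: NSCharacterSet.punctuationCharacters)
```

Because each class property returns a CharacterSet value in Swift, the shorthand forms such as .whitespacesAndNewlines can be used at the same call sites.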

Creating a Character Set for URL Encoding

class var urlFragmentAllowed: CharacterSet

Returns the character set for characters allowed in a fragment URL component.

class var urlHostAllowed: CharacterSet

Returns the character set for characters allowed in a host URL subcomponent.

class var urlPasswordAllowed: CharacterSet

Returns the character set for characters allowed in a password URL subcomponent.

class var urlPathAllowed: CharacterSet

Returns the character set for characters allowed in a path URL component.

class var urlQueryAllowed: CharacterSet

Returns the character set for characters allowed in a query URL component.

class var urlUserAllowed: CharacterSet

Returns the character set for characters allowed in a user URL subcomponent.
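These sets are commonly paired with the String method addingPercentEncoding(withAllowedCharacters:), which is not part of this class but is shown here as an assumed companion API. The sketch also illustrates a common pitfall: urlQueryAllowed permits "&" and "=", so a single query value usually needs a narrower set. The valueAllowed set is a hypothetical helper built with CharacterSet's remove(charactersIn:).

```swift
import Foundation

// Encode a whole query string: spaces are escaped, "&" and "=" are kept.
let query = "q=swift character set&lang=en"
let encodedQuery = query.addingPercentEncoding(
    withAllowedCharacters: NSCharacterSet.urlQueryAllowed)

// Encode a single value: also escape the query delimiters themselves.
// valueAllowed is an illustrative name, not a Foundation constant.
var valueAllowed = NSCharacterSet.urlQueryAllowed
valueAllowed.remove(charactersIn: "&=+")
let encodedValue = "a&b=c".addingPercentEncoding(
    withAllowedCharacters: valueAllowed)
```

Which delimiters to remove depends on how the receiving server parses queries, so treat the removed characters as a starting point rather than a rule.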

Creating a Custom Character Set

init(charactersIn: String)

Returns a character set containing the characters in a given string.

init(range: NSRange)

Returns a character set containing characters with Unicode values in a given range.

var inverted: CharacterSet

A character set containing only characters that don’t exist in the receiver.
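A short sketch of the three members above, assuming illustrative names (hexDigits, asciiUppercase, nonHex). The range covers the 26 code points starting at U+0041, that is, "A" through "Z".

```swift
import Foundation

// Build a set from an explicit list of characters.
let hexDigits = NSCharacterSet(charactersIn: "0123456789abcdefABCDEF")

// Build a set from a contiguous range of Unicode values (U+0041 to U+005A).
let asciiUppercase = NSCharacterSet(range: NSRange(location: 0x41, length: 26))

// The complement of hexDigits, returned as a CharacterSet value.
let nonHex = hexDigits.inverted

let f: Unicode.Scalar = "f"
let g: Unicode.Scalar = "g"
let fIsHex = hexDigits.characterIsMember(0x0066)          // "f"
let zIsUpper = asciiUppercase.characterIsMember(0x005A)   // "Z"
let bracketIsUpper = asciiUppercase.characterIsMember(0x005B) // "[" lies just past the range
let gIsNonHex = nonHex.contains(g)
let fIsNonHex = nonHex.contains(f)
```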

Creating and Managing Character Sets as Bitmap Representations

init(bitmapRepresentation: Data)

Returns a character set containing characters determined by a given bitmap representation.

init?(contentsOfFile: String)

Returns a character set read from the bitmap representation stored in the file at a given path.

var bitmapRepresentation: Data

An NSData object encoding the receiver in binary format.
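The following sketch round-trips a set through its bitmap representation in memory. The variable names are illustrative, and the code makes no assumption about the exact byte layout beyond what the initializer accepts back.

```swift
import Foundation

let original = NSCharacterSet(charactersIn: "abc")

// Serialize the set to its binary bitmap form.
let bitmap: Data = original.bitmapRepresentation

// Rebuild an equivalent set from the bitmap.
let restored = NSCharacterSet(bitmapRepresentation: bitmap)

let restoredHasA = restored.characterIsMember(0x0061) // "a"
let restoredHasD = restored.characterIsMember(0x0064) // "d"
```

The same Data value could be written to disk and later read back with init?(contentsOfFile:), which returns nil if the file cannot be read.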

Testing Set Membership

func characterIsMember(unichar)

Returns a Boolean value that indicates whether a given character is in the receiver.

func hasMemberInPlane(UInt8)

Returns a Boolean value that indicates whether the receiver has at least one member in a given character plane.

func isSuperset(of: CharacterSet)

Returns a Boolean value that indicates whether the receiver is a superset of another given character set.

func longCharacterIsMember(UTF32Char)

Returns a Boolean value that indicates whether a given long character is a member of the receiver.
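The membership methods differ mainly in the width of the value they accept: characterIsMember(_:) takes a 16-bit UTF-16 code unit, while longCharacterIsMember(_:) takes a full 32-bit code point and so can test characters outside the Basic Multilingual Plane. A minimal sketch, with made-up set names:

```swift
import Foundation

let vowels = NSCharacterSet(charactersIn: "aeiou")
let faces = NSCharacterSet(charactersIn: "\u{1F600}") // U+1F600 lies in plane 1

// 16-bit test: suitable for characters in plane 0 only.
let aIsVowel = vowels.characterIsMember(0x0061) // "a"
let bIsVowel = vowels.characterIsMember(0x0062) // "b"

// 32-bit test: works for any Unicode code point.
let hasGrinningFace = faces.longCharacterIsMember(0x1F600)

// Plane test: true if at least one member falls in the given plane.
let facesInPlane1 = faces.hasMemberInPlane(1)
let vowelsInPlane0 = vowels.hasMemberInPlane(0)

// Superset test against a smaller set.
let coversAE = vowels.isSuperset(of: CharacterSet(charactersIn: "ae"))
```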

Constants

NSOpenStepUnicodeReservedBase

Specifies the lower bound for a Unicode character range reserved for Apple’s corporate use.