Class

NLEmbedding

A map of strings to vectors, which locates neighboring, similar strings.

Declaration

@interface NLEmbedding : NSObject

Overview

Use an NLEmbedding to find similar strings based on the proximity of their vectors. The vocabulary is the entire set of strings in an embedding. Each string in the vocabulary has a vector, which is an array of doubles, and each double corresponds to a dimension in the embedding. An NLEmbedding uses these vectors to determine the distance between two strings, or to find the nearest neighbors of a string in the vocabulary. The more similar two strings are, the smaller the distance between them.
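To make the relationship between vectors and distance concrete, the following Python sketch computes a cosine distance over a toy vocabulary. It is an illustration of the idea only, not the framework's implementation: the vocabulary, its three-dimensional vectors, and the function names are hypothetical, and the formula assumes the common definition of cosine distance as one minus cosine similarity.

```python
import math

# Hypothetical toy vocabulary: each string maps to a 3-dimensional vector.
VOCABULARY = {
    "cat": [1.0, 0.0, 0.0],
    "kitten": [0.9, 0.1, 0.0],
    "car": [0.0, 1.0, 0.0],
}


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def distance_between(first, second, vocabulary=VOCABULARY):
    """Distance between two vocabulary strings (KeyError if one is missing)."""
    return cosine_distance(vocabulary[first], vocabulary[second])
```

In this toy data, distance_between("cat", "kitten") is smaller than distance_between("cat", "car") because the first pair of vectors points in nearly the same direction.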

Natural Language provides built-in word embeddings that you can retrieve by using the wordEmbeddingForLanguage: method. You can also compile your own custom embedding into an efficient, searchable, on-disk representation. Typically, you compile an embedding by using Create ML’s MLWordEmbedding and save it as a file for your Xcode project at development time. Alternatively, you can compile an embedding at runtime by using Natural Language’s writeEmbeddingForDictionary:language:revision:toURL:error: method.

Your custom embedding can use any kind of string that’s useful to your app, such as phrases, brand names, serial numbers, and so on. For example, you could make an embedding of movie titles. Each movie title could have a vector that places similar movies close together in the embedding.
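As a rough picture of the data behind such an embedding, the Python sketch below models a dictionary of movie titles and vectors and checks that every vector has the same number of dimensions before it would be compiled. The titles, the vector values, and the validate_embedding helper are all hypothetical, and the uniform-dimension check is an assumption drawn from the embedding having a single dimension count.

```python
# Hypothetical movie-title embedding: similar movies get nearby vectors.
MOVIES = {
    "Space Voyage": [0.9, 0.1, 0.0],
    "Return to Orbit": [0.8, 0.2, 0.1],
    "Summer at the Lake": [0.0, 0.2, 0.9],
}


def validate_embedding(vectors):
    """Return the shared dimension, or raise ValueError if vectors disagree."""
    if not vectors:
        raise ValueError("embedding has no entries")
    dimensions = {len(vector) for vector in vectors.values()}
    if len(dimensions) != 1:
        raise ValueError("vectors have mixed dimensions: %s" % sorted(dimensions))
    dimension = dimensions.pop()
    if dimension == 0:
        raise ValueError("vectors are empty")
    return dimension
```

A dictionary that passes this kind of check has the same overall shape as the input that writeEmbeddingForDictionary:language:revision:toURL:error: takes: strings as keys and numeric vectors as values.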

Topics

Creating a Word Embedding

+ wordEmbeddingForLanguage:

Retrieves a word embedding for a given language.

+ wordEmbeddingForLanguage:revision:

Retrieves a word embedding for a given language and revision.

+ embeddingWithContentsOfURL:error:

Creates a word embedding from a model file.

Finding Strings in an Embedding

NLDistance

The distance between two strings in a text embedding.

- distanceBetweenString:andString:distanceType:

Calculates the distance between two strings.

- neighborsForString:maximumCount:distanceType:

Retrieves the nearest neighbors of a string, given a number of neighbors and a distance type.

- neighborsForString:maximumCount:maximumDistance:distanceType:

Retrieves the nearest neighbors of a string, given a distance, a number of neighbors, and a distance type.

- neighborsForVector:maximumCount:distanceType:

Retrieves the nearest strings to a vector, given a number of neighbors and a distance type.

- neighborsForVector:maximumCount:maximumDistance:distanceType:

Retrieves the nearest strings to a vector, given a distance, a number of neighbors, and a distance type.

- enumerateNeighborsForString:maximumCount:distanceType:usingBlock:

Enumerates the nearest neighbors of a string, given a block, a number of neighbors, and a distance type.

- enumerateNeighborsForString:maximumCount:maximumDistance:distanceType:usingBlock:

Enumerates the nearest neighbors of a string, given a block, a distance, a number of neighbors, and a distance type.

- enumerateNeighborsForVector:maximumCount:distanceType:usingBlock:

Enumerates the nearest strings to a vector, given a block, a number of neighbors, and a distance type.

- enumerateNeighborsForVector:maximumCount:maximumDistance:distanceType:usingBlock:

Enumerates the nearest strings to a vector, given a block, a distance, a number of neighbors, and a distance type.
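The neighbor methods in this group share two limits: a maximum count and, optionally, a maximum distance. The Python sketch below shows one way those limits could combine in a brute-force search. It is a conceptual model under stated assumptions, not the framework's algorithm: the vocabulary and function names are hypothetical, the distance is assumed to be one minus cosine similarity, and excluding the query string from its own neighbors is also an assumption.

```python
import math

# Hypothetical toy vocabulary of 3-dimensional vectors.
VOCABULARY = {
    "cat": [1.0, 0.0, 0.0],
    "kitten": [0.9, 0.1, 0.0],
    "car": [0.0, 1.0, 0.0],
    "truck": [0.1, 0.9, 0.0],
}


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def neighbors_for_vector(query, maximum_count, maximum_distance=math.inf,
                         vocabulary=VOCABULARY):
    """Return up to maximum_count (string, distance) pairs, nearest first."""
    scored = [(cosine_distance(query, vector), string)
              for string, vector in vocabulary.items()]
    # Apply the distance cutoff, then sort by distance (ties by string).
    within = sorted(pair for pair in scored if pair[0] <= maximum_distance)
    return [(string, distance) for distance, string in within[:maximum_count]]


def neighbors_for_string(string, maximum_count, maximum_distance=math.inf,
                         vocabulary=VOCABULARY):
    """Like neighbors_for_vector, but looks up the query and skips it."""
    if string not in vocabulary:
        return []  # Assumption: an unknown string has no neighbors.
    candidates = neighbors_for_vector(vocabulary[string], len(vocabulary),
                                      maximum_distance, vocabulary)
    return [pair for pair in candidates if pair[0] != string][:maximum_count]
```

With this toy data, asking for two neighbors of "cat" returns "kitten" first, and lowering the maximum distance to 0.5 drops every other candidate.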

Inspecting the Vocabulary of an Embedding

dimension

The number of dimensions in the vocabulary’s vector space.

vocabularySize

The number of words in the vocabulary.

language

The language of the text in the word embedding.

- containsString:

Requests a Boolean value that indicates whether the term is in the vocabulary.

- vectorForString:

Requests the vector for the given term.

- getVector:forString:

Copies a vector into the given pointer to a float array.

revision

The revision of the word embedding.

Saving an Embedding

+ writeEmbeddingForDictionary:language:revision:toURL:error:

Compiles an embedding from a dictionary of strings and their vectors, and writes it to a file at the given URL.

Checking for Natural Language Support

+ currentRevisionForLanguage:

Retrieves the current version of a word embedding for the given language.

+ supportedRevisionsForLanguage:

Retrieves all version numbers of a word embedding for the given language.

Relationships

Inherits From