Implement and run neural networks, using previously obtained training data.


Basic Neural Network Subroutines (BNNS) is a collection of functions that you use to implement and run neural networks, using previously obtained training data.

Creating a Neural Network for Inference

The Accelerate framework’s BNNS library is a collection of functions that you can use to construct neural networks. It is supported in macOS, iOS, tvOS, and watchOS, and is optimized for all CPUs supported on those platforms.

BNNS supports the implementation and operation of neural networks for inference, using weights previously derived from training. BNNS does not perform training itself; its purpose is to provide high-performance inference with already trained neural networks.

This document introduces BNNS in general terms. For further details, consult the header file bnns.h in the Accelerate framework.

Structure of a Neural Network

A neural network is a sequence of layers, each layer performing a filter operation on its input and passing the result as input to the next layer. The output of the last layer is an inference drawn from the initial input: for example, the initial input might be an image and the inference might be that it’s an image of a dinosaur.

A layer consists of a filter, data derived from training, and an activation function. The filter uses the training data to transform the input; for example, in a convolution layer the filter uses the training data as the weights of the convolution kernel.

The activation function is applied to the output data after the filter. You select it from a small set of simple functions given by the BNNSActivationFunction enumeration type in bnns.h; some of these functions accept one or two float parameters that you specify.

BNNS provides functions for creating, applying, and destroying three kinds of layers:

  • A convolution layer computes each pixel of its output image from the corresponding pixel of the input image and its neighboring pixels, combining their values with weights taken from the training data.

  • A pooling layer produces a smaller output image from its input image by breaking the input image into smaller rectangular subimages; each pixel in the output is the maximum or average (your choice) of the pixels in the corresponding subimage. A pooling layer does not use training data.

  • A fully connected layer takes its input as a vector; this vector is multiplied by a matrix of weights from training data. The resulting vector is updated by the activation function.


Filter Application


Applies a filter to an input, writing out the result to a specified output.


Applies a filter to a set of input objects, writing out the result to a set of output objects.


A structure containing common filter parameters.

Activation Layers


A structure containing layer functions.


A structure containing common activation function parameters.

Convolution Layers


Returns a convolution filter, initialized with input, output, layer, and filter parameters.


A structure containing convolution parameters.

Fully Connected Layers


Returns a fully connected filter, initialized with input, output, layer, and filter parameters.


A structure containing fully connected layer parameters.

Pooling Layers


Returns a pooling filter, initialized with input, output, layer, and filter parameters.


A structure containing pooling layer parameters.


A structure containing pooling functions.

Data Types


Options that define the storage data type.


A structure containing common layer parameters.

Memory Management


Destroys the specified filter, releasing all resources allocated for it.


A type alias for a user-provided memory allocation function.


A type alias for a user-provided memory deallocation function.