A description of a gated recurrent unit block or layer.


class MPSGRUDescriptor : MPSRNNDescriptor


The recurrent neural network (RNN) layer initialized with a MPSGRUDescriptor transforms the input data (image or matrix) and previous output with a set of filters. Each produces one feature map in the output data according to the gated recurrent unit (GRU) unit formula detailed below.

You may provide the GRU unit with a single input or a sequence of inputs. The layer also supports p-norm gating.

Description of Operation

  1. Let x_j be the input data (at time index t of sequence, j index containing quadruplet: batch index, x,y and feature index (x = y = 0 for matrices)).

  2. Let h0_j be the recurrent input (previous output) data from previous time step (at time index t-1 of sequence).

  3. Let h_i be the proposed new output.

  4. Let h1_i be the output data produced at this time step.

  5. Let Wz_ij, Uz_ij be the input gate weights for input and recurrent input data, respectively.

  6. Let bi_i be the bias for the input gate.

  7. Let Wr_ij, Ur_ij be the recurrent gate weights for input and recurrent input data, respectively.

  8. Let br_i be the bias for the recurrent gate.

  9. Let Wh_ij, Uh_ij, Vh_ij be the output gate weights for input, recurrent gate, and input gate, respectively.

  10. Let bh_i be the bias for the output gate.

  11. Let gz(x), gr(x), gh(x) be the neuron activation function for the input, recurrent, and output gates.

  12. Let p > 0 be a scalar variable (typical p >= 1.0) that defines the p-norm gating norm value.

The output of the GRU layer is computed as follows:

z_i = gz(  Wz_ij * x_j  +  Uz_ij * h0_j  +  bz_i  )
r_i = gr(  Wr_ij * x_j  +  Ur_ij * h0_j  +  br_i  )
c_i =      Uh_ij * (r_j h0_j)  +  Vh_ij * (z_j h0_j)
h_i = gh(  Wh_ij * x_j  + c_i + bh_i  )

h1_i = ( 1 - z_i ^ p)^(1/p) h0_i + z_i h_i

The * stands for convolution (see MPSRNNImageInferenceLayer) or matrix-vector/matrix multiplication (see MPSRNNMatrixInferenceLayer).

Summation is over index j (except for the batch index), but there's no summation over repeated index i, the output index.

Note that for validity, all intermediate images must be of same size, and all U and V matrices must be square (that is, outputFeatureChannels == inputFeatureChannels). Also, the bias terms are scalars with regard to spatial dimensions. The conventional GRU block is achieved by setting Vh = 0 (nil), and the Minimal Gated Unit is achieved with Uh = 0.


Inherits From

Conforms To

See Also

Recurrent Neural Networks

class MPSRNNImageInferenceLayer

A recurrent neural network layer for inference on Metal Performance Shaders images.

class MPSRNNMatrixInferenceLayer

A recurrent neural network layer for inference on Metal Performance Shaders matrices.

class MPSRNNSingleGateDescriptor

A description of a simple recurrent block or layer.

class MPSLSTMDescriptor

A description of a long short-term memory block or layer.

enum MPSRNNSequenceDirection

Directions that a sequence of inputs can be processed by a recurrent neural network layer.

class MPSRNNMatrixTrainingLayer

A layer for training recurrent neural networks on Metal Performance Shaders matrices.

class MPSRNNMatrixTrainingState

A class that holds data from a forward pass to be used in a backward pass.

Beta Software

This documentation contains preliminary information about an API or technology in development. This information is subject to change, and software implemented according to this documentation should be tested with final operating system software.

Learn more about using Apple's beta software