A description of a gated recurrent unit block or layer.
- iOS 11.0+
- macOS 10.13+
- tvOS 11.0+
- Mac Catalyst 13.0+ Beta
- Metal Performance Shaders
The recurrent neural network (RNN) layer initialized with an MPSGRUDescriptor transforms the input data (image or matrix) and the previous output with a set of filters. Each filter produces one feature map in the output data according to the gated recurrent unit (GRU) formula detailed below.
You may provide the GRU unit with a single input or a sequence of inputs. The layer also supports p-norm gating.
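As a rough illustration of p-norm gating, the sketch below assumes the blend rule h1 = (1 - z^p)^(1/p) h0 + z h, where z is the input gate value in [0, 1], h0 is the previous output, and h is the proposed new output. The function names carry_weight and blend are hypothetical and are not part of Metal Performance Shaders.

```python
def carry_weight(z, p=1.0):
    """Weight applied to the previous output h0 for a gate value z in [0, 1].

    Assumed rule: (1 - z**p) ** (1/p), so that z**p + weight**p == 1.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    return (1.0 - z ** p) ** (1.0 / p)


def blend(h0, h, z, p=1.0):
    """Mix the previous output h0 and the proposed output h for one element."""
    return carry_weight(z, p) * h0 + z * h
```

With p = 1 the weight on h0 is the familiar 1 - z. With p = 2 the pair (z, weight) lies on the unit circle, so a gate value of 0.6 keeps a weight of about 0.8 on the previous output.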
Description of Operation
Let x_j be the input data (at time index t of the sequence, where the index j contains the quadruplet: batch index, x, y, and feature index; x = y = 0 for matrices).
Let h0_j be the recurrent input (previous output) data from the previous time step (at time index t-1 of the sequence).
Let h_i be the proposed new output.
Let h1_i be the output data produced at this time step.
Let Wz_ij, Uz_ij be the input gate weights for input and recurrent input data, respectively.
Let bi_i be the bias for the input gate.
Let Wr_ij, Ur_ij be the recurrent gate weights for input and recurrent input data, respectively.
Let br_i be the bias for the recurrent gate.
Let Wh_ij, Uh_ij, Vh_ij be the output gate weights for input, recurrent gate, and input gate, respectively.
Let bh_i be the bias for the output gate.
Let gz(x), gr(x), gh(x) be the neuron activation functions for the input, recurrent, and output gates, respectively.
Let p > 0 be a scalar variable (typically p >= 1) that defines the p-norm gating norm value.
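The output of the GRU layer is computed as follows:
    z_i = gz(Wz_ij * x_j + Uz_ij * h0_j + bi_i)
    r_i = gr(Wr_ij * x_j + Ur_ij * h0_j + br_i)
    c_i = Uh_ij * (r_j h0_j) + Vh_ij * (z_j h0_j)
    h_i = gh(Wh_ij * x_j + c_i + bh_i)
    h1_i = (1 - z_i^p)^(1/p) h0_i + z_i h_i
The * stands for convolution (see MPSRNNImageInferenceLayer) or matrix-vector/matrix multiplication (see MPSRNNMatrixInferenceLayer). Summation is over index j (except for the batch index), but there's no summation over the repeated index i, the output index.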
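The following is a minimal reference sketch of one time step for the matrix case (x = y = 0, batch of one), not the framework's implementation. It assumes the update rule z = gz(Wz x + Uz h0 + bi), r = gr(Wr x + Ur h0 + br), c = Uh (r h0) + Vh (z h0), h = gh(Wh x + c + bh), h1 = (1 - z^p)^(1/p) h0 + z h, with sigmoid gates and a tanh output activation chosen only for illustration. The function gru_step and the dictionary keys are hypothetical names that mirror the symbols above.

```python
import math


def sigmoid(v):
    """Logistic function, used here as an assumed gate activation."""
    return 1.0 / (1.0 + math.exp(-v))


def matvec(m, v):
    """Matrix-vector product: sums m[i][j] * v[j] over j for each output index i."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def gru_step(x, h0, w, p=1.0, gz=sigmoid, gr=sigmoid, gh=math.tanh):
    """Compute h1 from input x and previous output h0 (plain Python lists).

    w maps the hypothetical keys Wz, Uz, bi, Wr, Ur, br, Wh, Uh, Vh, bh to
    matrices (lists of rows) or bias vectors. Uh or Vh may be None or absent,
    which is treated as a zero matrix. gz is expected to return values in
    [0, 1] so that the p-norm carry weight stays real.
    """
    n = len(h0)

    def optional(m, v):
        # A missing matrix stands in for Uh = 0 or Vh = 0.
        return matvec(m, v) if m is not None else [0.0] * n

    z = [gz(a + b + c) for a, b, c in
         zip(matvec(w["Wz"], x), matvec(w["Uz"], h0), w["bi"])]
    r = [gr(a + b + c) for a, b, c in
         zip(matvec(w["Wr"], x), matvec(w["Ur"], h0), w["br"])]

    # Element-wise gating of the previous output; no sum over the shared index.
    rh = [r_j * h_j for r_j, h_j in zip(r, h0)]
    zh = [z_j * h_j for z_j, h_j in zip(z, h0)]

    c = [a + b for a, b in zip(optional(w.get("Uh"), rh), optional(w.get("Vh"), zh))]
    h = [gh(a + b + d) for a, b, d in zip(matvec(w["Wh"], x), c, w["bh"])]

    # p-norm blend of the previous output and the proposed output.
    return [(1.0 - z_i ** p) ** (1.0 / p) * h0_i + z_i * h_i
            for z_i, h0_i, h_i in zip(z, h0, h)]
```

To process a sequence with this sketch, call gru_step once per time step and pass each returned h1 as the h0 of the next call.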
Note that for validity, all intermediate images must be of the same size, and all U and V matrices must be square (that is, their number of output feature channels must equal their number of input feature channels). Also, the bias terms are scalars with regard to spatial dimensions. The conventional GRU block is achieved by setting Vh = 0 (nil), and the Minimal Gated Unit is achieved with Uh = 0.
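To make the two special cases concrete, the worked reduction below assumes the general proposed output h_i = gh(Wh_ij * x_j + Uh_ij * (r_j h0_j) + Vh_ij * (z_j h0_j) + bh_i) and the blend h1_i = (1 - z_i^p)^(1/p) h0_i + z_i h_i. It is written for the matrix case, so * becomes an explicit sum over j, and it takes p = 1 for readability.

```latex
% Conventional GRU: Vh = 0, p = 1
\begin{aligned}
h_i  &= g_h\Big(\sum_j \mathit{Wh}_{ij}\, x_j
        + \sum_j \mathit{Uh}_{ij}\,\big(r_j\, \mathit{h0}_j\big) + \mathit{bh}_i\Big) \\
\mathit{h1}_i &= (1 - z_i)\, \mathit{h0}_i + z_i\, h_i
\end{aligned}

% Minimal Gated Unit: Uh = 0, p = 1
\begin{aligned}
h_i  &= g_h\Big(\sum_j \mathit{Wh}_{ij}\, x_j
        + \sum_j \mathit{Vh}_{ij}\,\big(z_j\, \mathit{h0}_j\big) + \mathit{bh}_i\Big) \\
\mathit{h1}_i &= (1 - z_i)\, \mathit{h0}_i + z_i\, h_i
\end{aligned}
```

In the second case the recurrent gate r no longer influences the output, because under the assumed formula r enters only through the Uh term; the single gate z both scales the previous output inside the activation and controls the final blend.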