mxnet.gluon.rnn.GRU

class mxnet.gluon.rnn.GRU(hidden_size, num_layers=1, layout='TNC', dropout=0, bidirectional=False, input_size=0, i2h_weight_initializer=None, h2h_weight_initializer=None, i2h_bias_initializer='zeros', h2h_bias_initializer='zeros', **kwargs)

Applies a multi-layer gated recurrent unit (GRU) RNN to an input sequence. Note: this implementation follows the cuDNN variant of the GRU, which differs slightly from Cho et al. 2014: the reset gate \(r_t\) is applied after the matrix multiplication.

For each element in the input sequence, each layer computes the following function:

\[\begin{split}\begin{array}{ll} r_t = sigmoid(W_{ir} x_t + b_{ir} + W_{hr} h_{(t-1)} + b_{hr}) \\ i_t = sigmoid(W_{ii} x_t + b_{ii} + W_{hi} h_{(t-1)} + b_{hi}) \\ n_t = \tanh(W_{in} x_t + b_{in} + r_t * (W_{hn} h_{(t-1)} + b_{hn})) \\ h_t = (1 - i_t) * n_t + i_t * h_{(t-1)} \\ \end{array}\end{split}\]

where \(h_t\) is the hidden state at time t, \(x_t\) is the hidden state of the previous layer at time t or \(input_t\) for the first layer, and \(r_t\), \(i_t\), \(n_t\) are the reset, input, and new gates, respectively.
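The equations above can be sketched for a single time step in plain Python. This is an illustrative sketch only, not the library's implementation: the helper names (`affine`, `gru_step`, `zero_params`) and the dictionary of parameters keyed by the symbols in the formulas are assumptions made for this example.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, b, v):
    """Return W @ v + b for a matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def gru_step(x, h_prev, p):
    """One GRU time step; `p` is a hypothetical dict of weights and biases."""
    r = [sigmoid(a + c) for a, c in zip(affine(p["W_ir"], p["b_ir"], x),
                                        affine(p["W_hr"], p["b_hr"], h_prev))]
    i = [sigmoid(a + c) for a, c in zip(affine(p["W_ii"], p["b_ii"], x),
                                        affine(p["W_hi"], p["b_hi"], h_prev))]
    # The reset gate scales the already-multiplied recurrent term (cuDNN variant).
    n = [math.tanh(a + rt * c)
         for a, rt, c in zip(affine(p["W_in"], p["b_in"], x), r,
                             affine(p["W_hn"], p["b_hn"], h_prev))]
    return [(1.0 - it) * nt + it * hp for it, nt, hp in zip(i, n, h_prev)]


def zero_params(input_size, hidden_size):
    """All-zero parameters, used only to make the example easy to check by hand."""
    p = {}
    for gate in "rin":
        p["W_i" + gate] = [[0.0] * input_size for _ in range(hidden_size)]
        p["W_h" + gate] = [[0.0] * hidden_size for _ in range(hidden_size)]
        p["b_i" + gate] = [0.0] * hidden_size
        p["b_h" + gate] = [0.0] * hidden_size
    return p


h_next = gru_step([1.0, 2.0], [1.0, -2.0, 0.5], zero_params(2, 3))
```

With all-zero parameters both gates evaluate to 0.5 and \(n_t\) is 0, so each hidden unit is simply halved: `h_next` is `[0.5, -1.0, 0.25]`.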

Parameters

hidden_size (int) – The number of features in the hidden state h.
num_layers (int, default 1) – Number of recurrent layers.
layout (str, default 'TNC') – The format of input and output tensors. T, N and C stand for sequence length, batch size, and feature dimensions respectively.
dropout (float, default 0) – If non-zero, introduces a dropout layer on the outputs of each RNN layer except the last layer.
bidirectional (bool, default False) – If True, becomes a bidirectional RNN.
i2h_weight_initializer (str or Initializer) – Initializer for the input weights matrix, used for the linear transformation of the inputs.
h2h_weight_initializer (str or Initializer) – Initializer for the recurrent weights matrix, used for the linear transformation of the recurrent state.
i2h_bias_initializer (str or Initializer) – Initializer for the input-to-hidden bias vector.
h2h_bias_initializer (str or Initializer) – Initializer for the hidden-to-hidden (recurrent) bias vector.
input_size (int, default 0) – The number of expected features in the input x. If not specified, it will be inferred from input.
prefix (str or None) – Prefix of this Block.
params (ParameterDict or None) – Shared Parameters for this Block.
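To get a feel for how hidden_size, num_layers, bidirectional and input_size interact, the following sketch estimates the number of trainable values. It assumes the usual cuDNN-style layout implied by the equations above: three gates, with one input weight matrix, one recurrent weight matrix and two bias vectors per layer and direction, and each layer after the first receiving the concatenated output of both directions. The function name `gru_param_count` is hypothetical and not part of MXNet.

```python
def gru_param_count(input_size, hidden_size, num_layers=1, bidirectional=False):
    """Estimate the number of trainable values under the stated assumptions."""
    gates = 3  # reset, input and new gates
    dirs = 2 if bidirectional else 1
    total = 0
    layer_input = input_size
    for _ in range(num_layers):
        # Input and recurrent weight matrices for all three gates.
        per_dir = gates * hidden_size * (layer_input + hidden_size)
        # Two bias vectors (input and recurrent) for all three gates.
        per_dir += 2 * gates * hidden_size
        total += dirs * per_dir
        # Deeper layers consume the previous layer's (possibly concatenated) output.
        layer_input = hidden_size * dirs
    return total


count = gru_param_count(input_size=10, hidden_size=100, num_layers=3)
```

For this three-layer, unidirectional configuration with 10 input features, the estimate is 154800.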

Inputs:

data: input tensor with shape (sequence_length, batch_size, input_size) when layout is “TNC”. For other layouts, dimensions are permuted accordingly using the transpose() operator, which adds performance overhead. Consider creating batches in TNC layout during the data batching step.
states: initial recurrent state tensor with shape (num_layers, batch_size, num_hidden), where num_hidden equals hidden_size. If bidirectional is True, shape will instead be (2*num_layers, batch_size, num_hidden). If states is None, zeros will be used as default begin states.
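As a rough illustration of building time-major batches up front, the sketch below reorders a batch-major nested list into TNC order. It uses plain Python lists and assumes all sequences already have the same length; the helper name `ntc_to_tnc` is hypothetical, and a real data pipeline would typically do the equivalent on arrays.

```python
def ntc_to_tnc(batch):
    """Reorder a nested list shaped (N, T, C) into (T, N, C)."""
    # zip(*batch) walks the time axis, yielding one tuple of N feature vectors per step.
    return [list(step) for step in zip(*batch)]


# Two sequences (N=2), three time steps each (T=3), one feature per step (C=1).
ntc = [[[1.0], [2.0], [3.0]],
       [[4.0], [5.0], [6.0]]]
tnc = ntc_to_tnc(ntc)
```

After the call, `tnc[t][n]` holds the features of sequence `n` at time step `t`.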

Outputs:

out: output tensor with shape (sequence_length, batch_size, num_hidden) when layout is “TNC”. If bidirectional is True, output shape will instead be (sequence_length, batch_size, 2*num_hidden).
out_states: output recurrent state tensor with the same shape as states. If states is None, out_states will not be returned.
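The shape rules for inputs and outputs can be summarized in a small helper. This is a sketch for checking expectations, not an MXNet function: the name `gru_shapes` is hypothetical, and it assumes that the state shape does not depend on layout, since the description above gives it without a layout qualifier.

```python
def gru_shapes(seq_len, batch_size, hidden_size, num_layers=1,
               bidirectional=False, layout="TNC"):
    """Return (out_shape, state_shape) implied by the rules described above."""
    dirs = 2 if bidirectional else 1
    # The feature axis doubles when forward and backward outputs are concatenated.
    dims = {"T": seq_len, "N": batch_size, "C": hidden_size * dirs}
    out_shape = tuple(dims[axis] for axis in layout)
    # One state slice per layer and per direction.
    state_shape = (num_layers * dirs, batch_size, hidden_size)
    return out_shape, state_shape


out_shape, state_shape = gru_shapes(5, 3, 100, num_layers=3)
```

For a three-layer unidirectional GRU with hidden_size 100, sequence length 5 and batch size 3, this gives an output shape of (5, 3, 100) and a state shape of (3, 3, 100).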

Examples

>>> import mxnet as mx
>>> layer = mx.gluon.rnn.GRU(100, 3)
>>> layer.initialize()
>>> input = mx.nd.random.uniform(shape=(5, 3, 10))
>>> # by default zeros are used as begin state
>>> output = layer(input)
>>> # manually specify begin state.
>>> h0 = mx.nd.random.uniform(shape=(3, 3, 100))
>>> output, hn = layer(input, h0)

__init__(hidden_size, num_layers=1, layout='TNC', dropout=0, bidirectional=False, input_size=0, i2h_weight_initializer=None, h2h_weight_initializer=None, i2h_bias_initializer='zeros', h2h_bias_initializer='zeros', **kwargs): Initialize self. See help(type(self)) for accurate signature.

Methods

`__init__`(hidden_size[, num_layers, layout, …])	Initialize self.
`apply`(fn)	Applies `fn` recursively to every child block as well as self.
`begin_state`([batch_size, func])	Initial state for this cell.
`cast`(dtype)	Cast this Block to use another data type.
`collect_params`([select])	Returns a `ParameterDict` containing the Parameters of this `Block` and all of its children (default); can also return only the Parameters whose names match the given regular expressions.
`export`(path[, epoch])	Export HybridBlock to json format that can be loaded by SymbolBlock.imports, mxnet.mod.Module or the C++ interface.
`forward`(x, *args)	Defines the forward computation.
`hybrid_forward`(F, inputs[, states])	Overrides to construct symbolic graph for this Block.
`hybridize`([active])	Activates or deactivates `HybridBlock`s recursively.
`infer_shape`(*args)	Infers shape of Parameters from inputs.
`infer_type`(*args)	Infers data type of Parameters from inputs.
`initialize`([init, ctx, verbose, force_reinit])	Initializes the `Parameter`s of this `Block` and its children.
`load_parameters`(filename[, ctx, …])	Load parameters from file previously saved by save_parameters.
`load_params`(filename[, ctx, allow_missing, …])	[Deprecated] Please use load_parameters.
`name_scope`()	Returns a name space object managing a child `Block` and parameter names.
`register_child`(block[, name])	Registers block as a child of self.
`register_forward_hook`(hook)	Registers a forward hook on the block.
`register_forward_pre_hook`(hook)	Registers a forward pre-hook on the block.
`save_parameters`(filename)	Save parameters to file.
`save_params`(filename)	[Deprecated] Please use save_parameters.
`state_info`([batch_size])
`summary`(*inputs)	Print the summary of the model’s output and parameters.

Attributes

`name`	Name of this `Block`, without the trailing ‘_’.
`params`	Returns this `Block`’s parameter dictionary (does not include its children’s parameters).
`prefix`	Prefix of this `Block`.