mxnet.gluon.nn.LayerNorm
class mxnet.gluon.nn.LayerNorm(axis=-1, epsilon=1e-05, center=True, scale=True, beta_initializer='zeros', gamma_initializer='ones', in_channels=0, prefix=None, params=None)

Applies layer normalization to the n-dimensional input array. This operator takes an n-dimensional input array and normalizes the input using the given axis:
\[out = \frac{x - mean[data, axis]}{\sqrt{Var[data, axis] + \epsilon}} * gamma + beta\]

Parameters
axis (int, default -1) – The axis that should be normalized. This is typically the axis of the channels.
epsilon (float, default 1e-5) – Small float added to variance to avoid dividing by zero.
center (bool, default True) – If True, add offset of beta to normalized tensor. If False, beta is ignored.
scale (bool, default True) – If True, multiply by gamma. If False, gamma is not used.
beta_initializer (str or Initializer, default ‘zeros’) – Initializer for the beta weight.
gamma_initializer (str or Initializer, default ‘ones’) – Initializer for the gamma weight.
in_channels (int, default 0) – Number of channels (feature maps) in input data. If not specified, initialization will be deferred to the first time forward is called and in_channels will be inferred from the shape of input data.
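With center and scale enabled, the layer owns two learnable vectors, beta and gamma, each of length in_channels. A minimal construction sketch (the channel count below is illustrative, not a default):

>>> from mxnet.gluon import nn
>>> # Illustrative configuration; in_channels=10 is an assumption, not a
>>> # default. Fixing in_channels up front avoids deferred initialization.
>>> layer = nn.LayerNorm(axis=-1, epsilon=1e-5, center=True, scale=True, in_channels=10)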
Inputs:
    data: input tensor with arbitrary shape.
Outputs:
    out: output tensor with the same shape as data.
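Because the output keeps the input's shape, the layer can sit anywhere in a model; with the default axis=-1, each vector along the last axis is normalized independently. A sketch on a 3-dimensional input (the shape (2, 4, 8) is illustrative):

>>> import mxnet as mx
>>> from mxnet.gluon import nn
>>> layer = nn.LayerNorm()
>>> layer.initialize()
>>> y = layer(mx.nd.random.normal(shape=(2, 4, 8)))
>>> y.shape
(2, 4, 8)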
Examples
>>> # Input of shape (2, 5)
>>> x = mx.nd.array([[1, 2, 3, 4, 5], [1, 1, 2, 2, 2]])
>>> # Layer normalization is calculated with the above formula
>>> layer = LayerNorm()
>>> layer.initialize(ctx=mx.cpu(0))
>>> layer(x)
[[-1.41421   -0.707105   0.         0.707105   1.41421  ]
 [-1.2247195 -1.2247195  0.81647956 0.81647956 0.81647956]]
<NDArray 2x5 @cpu(0)>
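To connect this output back to the formula, the same numbers can be reproduced with plain NumPy; a sketch assuming the default initialization (gamma=1, beta=0), under which the formula reduces to (x - mean) / sqrt(var + epsilon):

>>> import numpy as np
>>> x = np.array([[1, 2, 3, 4, 5], [1, 1, 2, 2, 2]], dtype=np.float32)
>>> mean = x.mean(axis=-1, keepdims=True)
>>> var = x.var(axis=-1, keepdims=True)  # biased variance, as in the formula
>>> out = (x - mean) / np.sqrt(var + 1e-5)
>>> out  # matches the NDArray printed above up to floating-point rounding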
__init__(axis=-1, epsilon=1e-05, center=True, scale=True, beta_initializer='zeros', gamma_initializer='ones', in_channels=0, prefix=None, params=None)

Initialize self. See help(type(self)) for accurate signature.
Methods
__init__([axis, epsilon, center, scale, …]) – Initialize self.
apply(fn) – Applies fn recursively to every child block as well as self.
cast(dtype) – Cast this Block to use another data type.
collect_params([select]) – Returns a ParameterDict containing this Block's and all of its children's Parameters (default); can also return a ParameterDict selected by given regular expressions.
export(path[, epoch]) – Export HybridBlock to json format that can be loaded by SymbolBlock.imports, mxnet.mod.Module or the C++ interface.
forward(x, *args) – Defines the forward computation.
hybrid_forward(F, data, gamma, beta) – Overrides to construct symbolic graph for this Block.
hybridize([active]) – Activates or deactivates HybridBlocks recursively.
infer_shape(*args) – Infers shape of Parameters from inputs.
infer_type(*args) – Infers data type of Parameters from inputs.
initialize([init, ctx, verbose, force_reinit]) – Initializes Parameters of this Block and its children.
load_parameters(filename[, ctx, …]) – Load parameters from a file previously saved by save_parameters.
load_params(filename[, ctx, allow_missing, …]) – [Deprecated] Please use load_parameters.
name_scope() – Returns a name space object managing a child Block and parameter names.
register_child(block[, name]) – Registers block as a child of self.
register_forward_hook(hook) – Registers a forward hook on the block.
register_forward_pre_hook(hook) – Registers a forward pre-hook on the block.
save_parameters(filename) – Save parameters to file.
save_params(filename) – [Deprecated] Please use save_parameters.
summary(*inputs) – Print the summary of the model's output and parameters.
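Because LayerNorm is a HybridBlock, the methods above compose in the usual Gluon way; a hedged sketch of hybridizing the layer and round-tripping its parameters (the file name is illustrative):

>>> import mxnet as mx
>>> from mxnet.gluon import nn
>>> layer = nn.LayerNorm(in_channels=8)  # fixed channels, so shapes are known
>>> layer.initialize()
>>> layer.hybridize()                    # first call builds the symbolic graph
>>> _ = layer(mx.nd.ones((2, 8)))
>>> layer.save_parameters('layernorm.params')  # illustrative file name
>>> layer.load_parameters('layernorm.params')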
Attributes
name – Name of this Block, without '_' at the end.
params – Returns this Block's parameter dictionary (does not include its children's parameters).
prefix – Prefix of this Block.
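A quick sketch of reading these attributes off an instance; the exact names come from the auto-generated prefix, so the values in the comments are illustrative:

>>> layer = nn.LayerNorm()
>>> layer.name          # e.g. 'layernorm0'
>>> layer.prefix        # e.g. 'layernorm0_'
>>> list(layer.params)  # e.g. ['layernorm0_gamma', 'layernorm0_beta']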