mxnet.gluon.Trainer.step

Trainer.step(batch_size, ignore_stale_grad=False)

Makes one step of parameter update. Should be called after autograd.backward() and outside of record() scope.
For normal parameter updates, step() should be used; it internally calls allreduce_grads() and then update(). However, if you need access to the reduced gradients in order to transform them, such as for gradient clipping, you may want to call allreduce_grads() and update() manually instead (see the sketches after the parameter list).
Parameters
batch_size (int) – Batch size of the data processed. Gradients will be normalized by 1/batch_size. Set this to 1 if you normalized the loss manually with loss = mean(loss).
ignore_stale_grad (bool, optional, default=False) – If true, ignores Parameters with a stale gradient (a gradient that has not been updated by backward since the last step) and skips updating them.
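A minimal sketch of the normal usage, assuming a single context and an SGD optimizer chosen purely for illustration: step() is called once per batch, after backward() and outside the record() scope.

```python
import mxnet as mx
from mxnet import autograd, gluon

net = gluon.nn.Dense(1)
net.initialize()
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})
loss_fn = gluon.loss.L2Loss()

data = mx.nd.random.uniform(shape=(8, 4))
label = mx.nd.random.uniform(shape=(8, 1))

# Record the forward pass so gradients can be computed.
with autograd.record():
    loss = loss_fn(net(data), label)
loss.backward()

# Called outside record(); gradients are normalized by 1/batch_size
# before the optimizer update is applied.
trainer.step(batch_size=data.shape[0])
```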
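And a sketch of the manual allreduce_grads()/update() path with global-norm gradient clipping as the transformation. The max_norm threshold of 1.0 is an illustrative assumption, and update_on_kvstore=False is passed so that the Trainer permits calling allreduce_grads() and update() separately.

```python
import mxnet as mx
from mxnet import autograd, gluon

net = gluon.nn.Dense(1)
net.initialize()
# update_on_kvstore=False allows manual allreduce_grads()/update()
# calls in place of step().
trainer = gluon.Trainer(net.collect_params(), 'sgd',
                        {'learning_rate': 0.1}, update_on_kvstore=False)
loss_fn = gluon.loss.L2Loss()

data = mx.nd.random.uniform(shape=(8, 4))
label = mx.nd.random.uniform(shape=(8, 1))

with autograd.record():
    loss = loss_fn(net(data), label)
loss.backward()

trainer.allreduce_grads()  # reduce gradients across devices first
grads = [p.grad() for p in net.collect_params().values()
         if p.grad_req != 'null']
gluon.utils.clip_global_norm(grads, max_norm=1.0)  # illustrative threshold
trainer.update(batch_size=data.shape[0])  # apply the clipped gradients
```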