Momentum SGD optimization
Usage
momentum(stepsize = 0.05, beta1 = 0.9, beta2 = 1 - beta1)
Arguments
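- stepsize
step size (learning rate) applied to the velocity in each SGD update
- beta1
weight on the previous velocity v_{t-1} (the momentum coefficient)
- beta2
weight on the current gradient g_t; defaults to 1 - beta1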
Value
A list of control variables for optimization, intended to be passed to the control_opt function.
Details
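The update rule for momentum is:
$$v_t = \beta_1 v_{t-1} + \beta_2 g_t$$
$$x_{t+1} = x_t - \text{stepsize} \cdot v_t$$
where $g_t$ is the (stochastic) gradient at step $t$, $v_t$ is the velocity, and $x_t$ is the parameter vector. With the default $\beta_2 = 1 - \beta_1$, $v_t$ is an exponentially weighted moving average of past gradients.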
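
As an illustration only, the update rule can be sketched in a few lines of base R. The helper momentum_step and the toy gradient grad_fn below are hypothetical names introduced for this example; they are not part of the package and do not show how momentum() or control_opt are implemented internally.

```r
# Hypothetical helper (not part of the package): one momentum update.
momentum_step <- function(x, v, grad,
                          stepsize = 0.05, beta1 = 0.9, beta2 = 1 - beta1) {
  v_new <- beta1 * v + beta2 * grad  # v_t = beta1 * v_{t-1} + beta2 * g_t
  x_new <- x - stepsize * v_new      # x_{t+1} = x_t - stepsize * v_t
  list(x = x_new, v = v_new)
}

# Toy objective f(x) = sum(x^2) / 2, whose gradient is simply x.
grad_fn <- function(x) x

# Start away from the minimum with zero initial velocity.
state <- list(x = c(1, -2), v = c(0, 0))
for (t in seq_len(500)) {
  state <- momentum_step(state$x, state$v, grad_fn(state$x))
}
```

With these defaults the iterates on this toy problem spiral in toward the minimizer at zero. A real objective would supply its own (possibly noisy) gradient in place of grad_fn.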