AdaGrad SGD optimization — adagrad • ngme2

AdaGrad SGD optimization

Usage

adagrad(stepsize = 0.05, epsilon = 1e-08)

Arguments

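stepsize: step size (learning rate) applied to each SGD update; defaults to 0.05
epsilon: small positive constant added to the denominator for numerical stability; defaults to 1e-08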

Value

A list of control variables for optimization (used in the control_opt function).

Details

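The update rule for AdaGrad is:

$$v_t = v_{t-1} + g_t^2$$

$$x_{t+1} = x_t - \text{stepsize} \cdot \frac{g_t}{\sqrt{v_t} + \epsilon}$$

Here \(g_t\) is the gradient at iteration \(t\), \(v_t\) is the running sum of squared gradients, and \(x_t\) is the current parameter value. The square, square root, and division are applied elementwise, so each parameter gets its own effective step size, which shrinks as its squared gradients accumulate.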
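
The update rule can be illustrated with a minimal R sketch. This is not the ngme2 implementation: the helper `adagrad_step` is a hypothetical name introduced here, the accumulator is assumed to start at zero, and the quadratic objective is chosen only for demonstration.

```r
# Hypothetical helper (not part of ngme2): one AdaGrad update.
# x: current parameters, v: accumulated squared gradients, grad: gradient at x.
adagrad_step <- function(x, v, grad, stepsize = 0.05, epsilon = 1e-08) {
  v <- v + grad^2                                  # v_t = v_{t-1} + g_t^2
  x <- x - stepsize * grad / (sqrt(v) + epsilon)   # elementwise scaled step
  list(x = x, v = v)
}

# Toy objective f(x) = sum(x^2), whose gradient is 2 * x.
x <- c(1, -2)
v <- c(0, 0)  # assumption: accumulator initialised at zero

for (t in 1:100) {
  g <- 2 * x
  step <- adagrad_step(x, v, g)
  x <- step$x
  v <- step$v
}

print(x)         # both coordinates have moved toward 0
print(sum(x^2))  # objective is below its starting value of 5
```

On the first iteration each coordinate moves by roughly `stepsize`, whatever the size of its gradient, because the gradient is divided by its own absolute value. Later steps shrink as `v` grows.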