Usage
adagrad(stepsize = 0.05, epsilon = 1e-08)
Arguments
- stepsize
Step size (learning rate) used in the SGD update. Defaults to 0.05.
- epsilon
Small positive constant added to the denominator of the update for numerical stability. Defaults to 1e-08.
Value
A list of control variables for optimization, for use in the control_opt function.
Details
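The update rule for AdaGrad is:
$$v_t = v_{t-1} + g_t^2$$
$$x_{t+1} = x_t - \text{stepsize} \cdot \frac{g_t}{\sqrt{v_t} + \text{epsilon}}$$
where $g_t$ is the gradient at iteration $t$, $v_t$ is the running sum of squared gradients, and the square, square root, and division are applied element-wise.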
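The update rule above can be sketched in a few lines of base R. This is only an illustration, not the package's internal code: the helper `adagrad_step`, the toy objective, and the zero initial accumulator are all assumptions introduced here.

```r
# Hypothetical helper (not part of the package): one AdaGrad step.
# x: current parameters, v: accumulated squared gradients, grad: gradient at x.
adagrad_step <- function(x, v, grad, stepsize = 0.05, epsilon = 1e-08) {
  v <- v + grad^2                                 # v_t = v_{t-1} + g_t^2
  x <- x - stepsize * grad / (sqrt(v) + epsilon)  # x_{t+1}
  list(x = x, v = v)
}

# Toy objective f(x) = sum(x^2), whose gradient is 2 * x.
x <- c(1, -2)
v <- rep(0, length(x))  # assumed starting accumulator v_0 = 0

for (t in 1:200) {
  state <- adagrad_step(x, v, grad = 2 * x)
  x <- state$x
  v <- state$v
}
```

With a zero starting accumulator, the first step moves every coordinate by almost exactly `stepsize`, whatever the gradient's magnitude, because $g_1 / (\sqrt{g_1^2} + \text{epsilon}) \approx \pm 1$. Later steps shrink as $v_t$ grows.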