Adaptive gradient descent

From the paper: https://arxiv.org/pdf/1910.09529

The update rule for adaptive gradient descent is:

$$\lambda_k = \min\left(\sqrt{1 + \theta_{k-1}}\,\lambda_{k-1},\; \frac{\|x_k - x_{k-1}\|}{2\,\|\nabla f(x_k) - \nabla f(x_{k-1})\|}\right)$$

$$x_{k+1} = x_k - \lambda_k \nabla f(x_k)$$

$$\theta_k = \lambda_k / \lambda_{k-1}$$
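Below is a minimal R sketch of this rule for a smooth function with gradient grad_f. It is illustrative only, not the package's internal implementation; the names adgd, grad_f, and n_iter are assumptions made for this example. Following the cited paper, the first iterate takes a plain gradient step with the initial stepsize, and theta starts at infinity so the first adaptive stepsize comes from the local curvature estimate alone.

# Minimal sketch of adaptive gradient descent (Malitsky & Mishchenko, 2019).
# grad_f: function returning the gradient of f; x0: starting point;
# lambda0: initial stepsize; n_iter: number of iterations.
adgd <- function(grad_f, x0, lambda0 = 0.01, n_iter = 1000) {
  x_prev <- x0
  g_prev <- grad_f(x0)
  x <- x0 - lambda0 * g_prev   # first step: plain gradient descent
  lambda_prev <- lambda0
  theta <- Inf                 # makes the sqrt(1 + theta) bound inactive at k = 1
  for (k in seq_len(n_iter)) {
    g <- grad_f(x)
    # Local curvature estimate ||x_k - x_{k-1}|| / (2 ||g_k - g_{k-1}||)
    denom <- 2 * sqrt(sum((g - g_prev)^2))
    local <- if (denom > 0) sqrt(sum((x - x_prev)^2)) / denom else Inf
    lambda <- min(sqrt(1 + theta) * lambda_prev, local)
    if (!is.finite(lambda)) lambda <- lambda_prev  # guard: both bounds infinite
    theta <- lambda / lambda_prev
    x_prev <- x
    g_prev <- g
    x <- x - lambda * g        # x_{k+1} = x_k - lambda_k * grad f(x_k)
    lambda_prev <- lambda
  }
  x
}

For example, minimizing the quadratic f(x) = 0.5 * ||Ax - b||^2:

A <- matrix(c(3, 1, 1, 2), 2, 2)
b <- c(1, 1)
grad_f <- function(x) as.vector(crossprod(A, A %*% x - b))
adgd(grad_f, x0 = c(0, 0))  # converges to solve(A, b) = (0.2, 0.4)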

Usage

adaptive_gd(stepsize = 0.01)

Arguments

stepsize

initial stepsize (lambda_0 in the update rule above) for the adaptive gradient descent iterations

Value

a list of control variables for the optimization routine, to be passed to the control_opt function
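A hypothetical usage sketch follows; the exact signature of control_opt depends on the package version, so only the construction and inspection of the control list is shown:

ctrl <- adaptive_gd(stepsize = 0.01)  # control list for the optimizer
str(ctrl)                             # inspect the returned control variables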