Variance-reduction methods such as SVRG and SpiderBoost combine large- and small-batch gradients to reduce the variance of stochastic gradient estimates.
Compared to SGD, these methods require at least double the stochastic gradient computations per iteration, in addition to periodic large-batch (or full) gradient evaluations at snapshot points.
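To make the estimator concrete, the following is a minimal Python sketch of an SVRG-style update; the helper names `grad_full` and `grad_i` and all parameter choices are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def svrg(grad_full, grad_i, x0, n, step=0.01, epochs=10, m=None, rng=None):
    """Minimal SVRG-style sketch (hypothetical helper names).

    grad_full(x)  -> full-batch gradient at x
    grad_i(x, i)  -> gradient of the i-th component function at x
    """
    rng = np.random.default_rng() if rng is None else rng
    m = n if m is None else m              # inner-loop length
    x_snap = np.asarray(x0, dtype=float).copy()
    for _ in range(epochs):
        mu = grad_full(x_snap)             # large-batch gradient at the snapshot
        x = x_snap.copy()
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced estimator: small-batch gradient corrected by
            # the snapshot terms; its expectation equals the full gradient at x.
            v = grad_i(x, i) - grad_i(x_snap, i) + mu
            x = x - step * v
        x_snap = x                         # refresh the snapshot
    return x_snap
```

Note that each inner step evaluates two component gradients, `grad_i(x, i)` and `grad_i(x_snap, i)`, which is the source of the at-least-doubled per-iteration cost relative to SGD.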
CheapSVRG is proposed as a new stochastic variance-reduction scheme that achieves a linear convergence rate while balancing computational complexity: the expensive snapshot gradient is replaced by a cheaper surrogate computation.
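The exact surrogate construction is not reproduced here; as a hedged illustration only, one plausible reading is to estimate the snapshot gradient from a random size-`b` subsample rather than a full pass, plugged in for `mu` in the sketch above. The function name and subsampling scheme below are assumptions for illustration:

```python
import numpy as np

def subsampled_snapshot_grad(grad_i, x_snap, n, b, rng):
    """Hypothetical surrogate for the snapshot gradient: average the
    component gradients over a random size-b subset instead of all n
    (an assumption, not the paper's exact construction)."""
    idx = rng.integers(n, size=b)
    return np.mean([grad_i(x_snap, i) for i in idx], axis=0)
```

Under this reading, `b` trades off the two costs named in the text: a larger subsample lowers the variance of the surrogate at higher per-snapshot cost, while a smaller one keeps snapshots cheap.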