The stochastic gradient descent (SGD) algorithm has been widely used in
statistical estimation for large-scale data due to its computational and memory
efficiency. While most existing work focuses on the convergence of the objective
function or the error of the obtained solution, we inv