This paper considers the stability of online learning algorithms and its implications for learnability (bounded regret). We introduce a novel quantity called {\em forward regret} that intuitively measures how good an online learning algorithm is if it is allowed a one-step look-ahead into the future. We show that given stability, bounded forward regret is equivalent to bounded regret. We also show that the existence of an algorithm with bounded regret implies the existence of a stable algorithm with bounded regret and bounded forward regret. The equivalence results apply to general, possibly non-convex problems. To the best of our knowledge, our analysis provides the first general connection between stability and regret in the online setting that is not restricted to a particular class of algorithms. Our stability-regret connection provides a simple recipe for analyzing regret incurred by any online learning algorithm. Using our framework, we analyze several existing online learning algorithms as well as the "approximate" versions of algorithms like RDA that solve an optimization problem at each iteration. Our proofs are simpler than existing analysis for the respective algorithms, show a clear trade-off between stability and forward regret, and provide tighter regret bounds in some cases. Furthermore, using our recipe, we analyze "approximate" versions of several algorithms such as follow-the-regularized-leader (FTRL) that requires solving an optimization problem at each step.

本文研究在线学习算法的稳定性及其对可学性（有限后悔）的影响，提出了一种称为“前向后悔”的新指标，用于测量在线学习算法的预测性能，证明了对于在线优化问题，稳定性等价于后悔有界，且有界前向后悔等价于有界后悔，在分析现有算法的可学性方面提供了一个简单的方法。

在线学习中稳定性与后悔之间的相互作用