Gradient descent is arguably one of the most popular online optimization
methods with a wide array of applications. However, the standard implementation,
in which agents update their strategies simultaneously, has several undesirable
properties: strategies diverge away from equilibrium and regret grows over
time. In this paper, we eliminate these negative properties by introducing a
different implementation that obtains finite regret with an arbitrary fixed step-size.
We obtain this surprising property by having agents take turns when updating
their strategies. In this setting, we show that an agent that uses gradient
descent obtains bounded regret -- regardless of how their opponent updates
their strategies. Furthermore, we show that in adversarial settings the
agents' strategies are bounded and cycle when both agents use the alternating
gradient descent algorithm.
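
To make the contrast between simultaneous and alternating updates concrete, the following is a minimal sketch rather than the paper's algorithm or experiments. It assumes a toy two-player zero-sum game with scalar strategies and payoff f(x, y) = x * y, where player x minimizes and player y maximizes, so the unique equilibrium is (0, 0). The function names, the step size eta = 0.1, the starting point (1, 1), and the choice to let player x move first are all illustrative assumptions.

```python
import math


def simultaneous_step(x, y, eta):
    """Both players update using the opponent's current strategy."""
    return x - eta * y, y + eta * x


def alternating_step(x, y, eta):
    """Players take turns: x moves first, then y responds to the updated x."""
    x_new = x - eta * y
    y_new = y + eta * x_new
    return x_new, y_new


def max_distance(step, x0, y0, eta, num_steps):
    """Largest distance from the equilibrium (0, 0) seen along the trajectory."""
    x, y = x0, y0
    farthest = math.hypot(x, y)
    for _ in range(num_steps):
        x, y = step(x, y, eta)
        farthest = max(farthest, math.hypot(x, y))
    return farthest


if __name__ == "__main__":
    # Assumed toy settings: eta = 0.1, start at (1, 1), 1000 rounds.
    for name, step in [("simultaneous", simultaneous_step),
                       ("alternating", alternating_step)]:
        dist = max_distance(step, 1.0, 1.0, 0.1, 1000)
        print(f"{name}: max distance from equilibrium = {dist:.3f}")
```

In this toy game, each simultaneous step multiplies the squared distance from equilibrium by 1 + eta^2, so the iterates spiral outward, whereas the alternating iterates stay on a closed curve around the equilibrium. This only illustrates the qualitative behavior described above on one example; it does not reproduce the regret bounds.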

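This paper introduces a non-standard implementation of gradient descent in which agents alternate their strategy updates while using a fixed step-size. It eliminates the problems of the standard implementation, where strategies drift away from equilibrium and regret keeps growing, and it recommends alternating gradient descent in adversarial settings to guarantee that strategies remain bounded and cycle.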