Delays and asynchrony are inevitable in large-scale machine-learning problems
where communication plays a key role. As such, several works have extensively
analyzed stochastic optimization with delayed gradients. However, as far as we
are aware, no analogous theory is available for min-max optimization.