We consider the distributionally robust optimization (DRO) problem with a spectral risk-based uncertainty set and an $f$-divergence penalty. This formulation includes common risk-sensitive learning objectives such as regularized conditional value-at-risk (CVaR) and average top-$k$ loss. We present Prospect, a stochastic gradient-based algorithm that requires tuning only a single learning rate hyperparameter, and prove that it enjoys linear convergence for smooth regularized losses. This contrasts with previous algorithms, which either require tuning multiple hyperparameters or may fail to converge because of biased gradient estimates or inadequate regularization. Empirically, we show that Prospect can converge 2-3$\times$ faster than baselines such as stochastic gradient and stochastic saddle-point methods on distribution shift and fairness benchmarks spanning tabular, vision, and language domains.
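
As a rough illustration of the objectives named above, a spectral risk can be computed as a weighted sum of the sorted losses, with nondecreasing weights so that larger losses count more. The sketch below assumes NumPy and covers only the unpenalized case (no $f$-divergence penalty); the function names `top_k_spectrum` and `spectral_risk` are hypothetical and are not taken from the Prospect implementation.

```python
import numpy as np


def top_k_spectrum(n: int, k: int) -> np.ndarray:
    """Weights for the average top-k loss: mass 1/k on each of the k largest losses."""
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    sigma = np.zeros(n)
    sigma[n - k:] = 1.0 / k
    return sigma


def spectral_risk(losses, sigma) -> float:
    """Weighted sum of the losses sorted in ascending order.

    sigma must be nonnegative, nondecreasing, and sum to one, so the
    largest losses receive the largest weights.
    """
    losses = np.asarray(losses, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    if losses.shape != sigma.shape:
        raise ValueError("losses and sigma must have the same shape")
    if np.any(sigma < 0) or np.any(np.diff(sigma) < 0):
        raise ValueError("sigma must be nonnegative and nondecreasing")
    if not np.isclose(sigma.sum(), 1.0):
        raise ValueError("sigma must sum to one")
    return float(np.sort(losses) @ sigma)


losses = [0.2, 1.5, 0.7, 3.0]
# Average of the two largest losses.
avg_top2 = spectral_risk(losses, top_k_spectrum(4, 2))
# Uniform weights recover the ordinary mean (empirical risk).
mean_loss = spectral_risk(losses, np.full(4, 0.25))
```

With uniform weights the spectral risk reduces to the usual average loss, while concentrating the weights on the largest losses gives the average top-$k$ loss, which is one way to see how a single weight vector spans this family of risk-sensitive objectives.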

Distributionally Robust Optimization with Bias and Variance Reduction