We provide a novel reduction from swap-regret minimization to external-regret
minimization, which improves upon the classical reductions of Blum-Mansour
[BM07] and Stolz-Lugosi [SL05] in that it does not require finiteness of the
space of actions. We show that, whenever there exists a no-external-regret
algorithm for some hypothesis class, there must also exist a no-swap-regret
algorithm for that same class. For the problem of learning with expert advice,
our result implies that it is possible to guarantee that the swap regret is
bounded by $\epsilon$ after $\log(N)^{O(1/\epsilon)}$ rounds and with $O(N)$
per-iteration complexity, where $N$ is the number of experts, while the
classical reductions of Blum-Mansour and Stolz-Lugosi require $O(N/\epsilon^2)$
rounds and at least $\Omega(N^2)$ per-iteration complexity. Our result comes
with an associated lower bound, which -- in contrast to that in [BM07] -- holds
for oblivious and $\ell_1$-constrained adversaries and learners that can employ
distributions over experts, showing that the number of rounds must be
$\tilde\Omega(N/\epsilon^2)$ or exponential in $1/\epsilon$.
Our reduction implies that, if no-regret learning is possible in some game,
then this game must have approximate correlated equilibria, of arbitrarily good
approximation. This strengthens the folklore implication of no-regret learning
that approximate coarse correlated equilibria exist. Importantly, it provides a
sufficient condition for the existence of correlated equilibrium which vastly
extends the requirement that the action set is finite, thus answering a
question left open by [DG22; Ass+23]. Moreover, it answers several outstanding
questions about equilibrium computation and/or learning in games.
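
To make the two regret notions above concrete, the following minimal sketch computes both for the experts setting. It uses the standard textbook definitions rather than the reduction described in the abstract; the function names, the list-of-lists data layout, and the normalization by the number of rounds are assumptions made for illustration.

```python
def external_regret(plays, losses):
    """Average external regret: learner's loss minus the best fixed expert's loss.

    plays[t][i]  = probability the learner puts on expert i in round t
    losses[t][i] = loss of expert i in round t (assumed to lie in [0, 1])
    """
    T, N = len(plays), len(losses[0])
    learner = sum(sum(p[i] * l[i] for i in range(N)) for p, l in zip(plays, losses))
    best_fixed = min(sum(l[j] for l in losses) for j in range(N))
    return (learner - best_fixed) / T


def swap_regret(plays, losses):
    """Average swap regret: best gain from remapping each expert i to some phi(i)."""
    T, N = len(plays), len(losses[0])
    total = 0.0
    for i in range(N):
        # Loss incurred through the probability mass placed on expert i ...
        mass_loss = sum(p[i] * l[i] for p, l in zip(plays, losses))
        # ... versus redirecting that same mass to the best single replacement j.
        best_swap = min(
            sum(p[i] * l[j] for p, l in zip(plays, losses)) for j in range(N)
        )
        total += mass_loss - best_swap
    return total / T


# Hypothetical 3-expert, 3-round instance: no fixed expert beats the learner,
# yet every action it played has a strictly better replacement.
plays = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
losses = [[0.5, 0.0, 1.0], [1.0, 0.5, 0.0], [0.0, 1.0, 0.5]]
ext = external_regret(plays, losses)   # 0.0
swp = swap_regret(plays, losses)       # 0.5
```

On this instance the external regret is zero while the swap regret is 0.5, which illustrates why a no-swap-regret guarantee is the stronger of the two.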

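We provide a novel reduction from swap-regret minimization to external-regret
minimization, which improves upon the classical reductions of Blum-Mansour and
Stolz-Lugosi in that it does not require the action space to be finite. Our
result shows that, whenever a no-external-regret algorithm exists for some
hypothesis class, a no-swap-regret algorithm must also exist for that class.
For the problem of learning with expert advice, our result shows that the swap
regret can be guaranteed to be bounded by $\epsilon$ after
$\log(N)^{O(1/\epsilon)}$ rounds and with $O(N)$ per-iteration complexity,
whereas the classical reductions of Blum-Mansour and Stolz-Lugosi require
$O(N/\epsilon^2)$ rounds and at least $\Omega(N^2)$ per-iteration complexity.
Our result also comes with an associated lower bound which, in contrast to that
in [BM07], holds for oblivious and $\ell_1$-constrained adversaries and for
learners that can employ distributions over experts, showing that the number of
rounds must be $\tilde\Omega(N/\epsilon^2)$ or exponential in $1/\epsilon$. Our
reduction implies that, if no-regret learning is possible in some game, then
that game must have approximate correlated equilibria of arbitrarily good
approximation. This strengthens the folklore implication of no-regret learning
that approximate coarse correlated equilibria exist. Importantly, it provides a
sufficient condition for the existence of correlated equilibrium that vastly
extends the requirement that the action set is finite, thus answering a
question left open by [DG22; Ass+23]. Moreover, it answers several outstanding
questions about equilibrium computation and/or learning in games.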

From External to Swap Regret 2.0: An Efficient Reduction and Oblivious Adversary for Large Action Spaces

The performance of neural networks degrades in the presence of noisy labels at
training time. Inspired by the setting of learning with expert advice, where
multiplicative weights (MW) updates were recently shown to be robust to
moderate data corruptions in expert advice, we propose to use MW for
reweighting examples during neural network optimization. We theoretically
establish the convergence of our method when used with gradient descent and
prove its advantage for label noise in 1D cases. We then empirically validate
our findings for the general case by showing that MW improves neural network
accuracy in the presence of label noise on CIFAR-10, CIFAR-100 and Clothing1M.
We also show the impact of our approach on adversarial robustness.
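
The idea of reweighting examples with MW during gradient descent can be sketched on a toy 1D least-squares problem. This is an illustration under stated assumptions, not the authors' exact procedure: the function name `mw_reweighted_gd`, the squared loss, the step sizes `lr` and `eta`, and the synthetic data are all hypothetical choices. Each example's weight is proportional to the exponential of its negated cumulative loss, so examples that stay badly fit (such as a mislabeled one) lose influence.

```python
import math


def mw_reweighted_gd(xs, ys, steps=500, lr=0.2, eta=0.5):
    """Fit y ~ theta * x by gradient descent on an MW-reweighted squared loss."""
    n = len(xs)
    theta = 0.0
    log_w = [0.0] * n  # log-weights; every example starts with equal weight
    w = [1.0 / n] * n
    for _ in range(steps):
        # Normalize weights (subtract the max log-weight for numerical stability).
        m = max(log_w)
        w = [math.exp(v - m) for v in log_w]
        z = sum(w)
        w = [v / z for v in w]
        # Per-example residuals and squared losses at the current parameter.
        residuals = [theta * x - y for x, y in zip(xs, ys)]
        losses = [0.5 * r * r for r in residuals]
        # Gradient step on the weighted loss.
        grad = sum(wi * r * x for wi, r, x in zip(w, residuals, xs))
        theta -= lr * grad
        # Multiplicative-weights update: penalize examples with high loss.
        log_w = [lw - eta * l for lw, l in zip(log_w, losses)]
    return theta, w


# Synthetic data: y = 2x, except index 3 whose label has its sign flipped.
xs = [0.5, 1.0, 1.5, 2.0, -0.5, -1.0, -1.5, -2.0]
ys = [2.0 * x for x in xs]
ys[3] = -ys[3]

theta_mw, weights = mw_reweighted_gd(xs, ys)
# Unweighted least-squares solution for comparison.
theta_plain = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
```

On this toy data the unweighted fit is pulled to about 0.93 by the single flipped label, while the reweighted run recovers a slope close to 2 and drives the weight of the corrupted example towards zero.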

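This work proposes a neural network optimization method that uses MW to reweight examples; the method is robust to label noise and improves accuracy, without harming adversarial robustness.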

Multiplicative Reweighting for Robust Neural Network Optimization

In recent years, a number of parameter-free algorithms have been
developed for online linear optimization over Hilbert spaces and for learning
with expert advice. These algorithms achieve optimal regret bounds that depend
on the unknown competitors, without having to tune the learning rates with
oracle choices.
We present a new intuitive framework to design parameter-free algorithms for
\emph{both} online linear optimization over Hilbert spaces and for learning
with expert advice, based on reductions to betting on outcomes of adversarial
coins. We instantiate it using a betting algorithm based on the
Krichevsky-Trofimov estimator. The resulting algorithms are simple, with no
parameters to be tuned, and they improve or match previous results in terms of
regret guarantee and per-round complexity.
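
As an illustration of the coin-betting reduction, the sketch below implements a one-dimensional bettor that uses the Krichevsky-Trofimov betting fraction. It is a minimal sketch under assumptions: gradients are taken to lie in $[-1, 1]$, the coin outcome in each round is the negative gradient, and the function name and the `initial_wealth` parameter are hypothetical. The learner bets a signed fraction of its current wealth, and the bet itself is the prediction, so no learning rate appears.

```python
def kt_coin_betting(gradients, initial_wealth=1.0):
    """1D online linear optimization via betting with the KT fraction.

    gradients: sequence of loss gradients g_t, assumed to lie in [-1, 1].
    Returns the list of predictions w_t and the final wealth.
    """
    wealth = initial_wealth
    coin_sum = 0.0  # running sum of past coin outcomes c_1 + ... + c_{t-1}
    predictions = []
    for t, g in enumerate(gradients, start=1):
        beta = coin_sum / t  # KT betting fraction, always strictly inside (-1, 1)
        w = beta * wealth    # signed bet, used directly as the prediction
        predictions.append(w)
        c = -g               # coin outcome is the negative gradient
        wealth += c * w      # win or lose the amount that was bet
        coin_sum += c
    return predictions, wealth


# If the coin always lands on +1 (gradient -1), the bets grow quickly.
preds, final_wealth = kt_coin_betting([-1.0] * 4)
```

With four gradients equal to $-1$ the wealth grows from 1 to 4.375. Since the wealth equals the initial endowment minus the learner's cumulative linear loss, a lower bound on wealth translates into an upper bound on regret.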

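This work builds simple, parameter-free learning algorithms for online linear optimization over Hilbert spaces and for learning with expert advice, using a betting mechanism that predicts the adversary's behavior, and achieves strong regret bounds and per-round complexity.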