Sparse iterative methods, in particular first-order methods, are known to be
among the most effective in solving large-scale two-player zero-sum
extensive-form games. The convergence rates of these methods depend heavily on
the properties of the distance-generating function that they are based on. We
investigate the acceleration of first-order methods for solving extensive-form
games through better design of the dilated entropy function---a class of
distance-generating functions related to the domains associated with the
extensive-form games. By introducing a new weighting scheme for the dilated
entropy function, we develop the first distance-generating function for the
strategy spaces of sequential games that has no dependence on the branching
factor of the player. This result improves the convergence rate of several
first-order methods by a factor of $\Omega(b^d d)$, where $b$ is the branching
factor of the player, and $d$ is the depth of the game tree.
Thus far, counterfactual regret minimization methods have been faster in
practice, and more popular, than first-order methods despite their
theoretically inferior convergence rates. Using our new weighting scheme and
practical tuning we show that, for the first time, the excessive gap technique
can be made faster than the fastest counterfactual regret minimization
algorithm, CFR+, in practice.
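
As a rough illustration of what a dilated entropy function looks like, the sketch below evaluates one on a tiny sequence-form strategy. It assumes the usual dilated form: a weighted sum, over the decision points (simplexes) of the strategy space, of the local negative entropy scaled by the probability mass of the parent sequence. The tree layout, the helper names, and the weights `beta` are hypothetical placeholders chosen for the example; they are not the weighting scheme introduced in the paper.

```python
import math


def neg_entropy(p):
    """Negative entropy sum_i p_i log p_i of a probability vector (0 log 0 = 0)."""
    return sum(pi * math.log(pi) for pi in p if pi > 0)


def dilated_entropy(x, simplexes):
    """Evaluate a dilated entropy function at the sequence-form vector x.

    simplexes is a list of (parent, indices, beta) triples:
      parent  - index in x of the parent sequence, or None at the root
      indices - indices in x of the sequences belonging to this simplex
      beta    - weight of this simplex (hypothetical values, see lead-in)
    """
    total = 0.0
    for parent, indices, beta in simplexes:
        mass = 1.0 if parent is None else x[parent]
        if mass <= 0:
            continue  # unreachable simplex contributes nothing
        local = [x[i] / mass for i in indices]  # behavioral strategy here
        total += beta * mass * neg_entropy(local)
    return total


# Toy strategy space: a root decision with two actions (sequences 0 and 1),
# and a second decision with two actions (sequences 2 and 3) below sequence 0.
simplexes = [(None, [0, 1], 2.0), (0, [2, 3], 1.0)]
x = [0.5, 0.5, 0.25, 0.25]  # x[0] + x[1] = 1 and x[2] + x[3] = x[0]
value = dilated_entropy(x, simplexes)
```

The weights are the only free design parameters in this form, which is why the choice of weighting scheme governs the constants that appear in the convergence rate.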

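This paper studies how to accelerate first-order methods for solving extensive-form games through better design of the dilated entropy function. It proposes a new weighting scheme and shows that, in practice, the excessive gap technique with this scheme can be made faster than CFR+.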

Theoretical and Practical Advances on Smoothing for Extensive-Form Games

Counterfactual Regret Minimization and variants (e.g. Public Chance Sampling
CFR and Pure CFR) have been known as the best approaches for creating
approximate Nash equilibrium solutions for imperfect information games such as
poker. This paper introduces CFR$^+$, a new algorithm that typically
outperforms the previously known algorithms by an order of magnitude or more in
terms of computation time while also potentially requiring less memory.
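
To give a feel for the kind of update involved, here is a minimal sketch of regret matching+, the clipped-regret update associated with CFR$^+$, applied to self-play in a small zero-sum matrix game rather than to an extensive-form game. The function names, the simultaneous (not alternating) updates, and the linear weighting of the average strategy are simplifying assumptions made for this example, not a reproduction of the paper's implementation.

```python
def rm_plus_strategy(q):
    """Play proportionally to the clipped cumulative regrets q (uniform if all zero)."""
    total = sum(q)
    n = len(q)
    if total <= 0:
        return [1.0 / n] * n
    return [v / total for v in q]


def rm_plus_update(q, utilities, strategy):
    """Add instantaneous regrets to q and clip each entry at zero."""
    expected = sum(s * u for s, u in zip(strategy, utilities))
    return [max(0.0, qi + u - expected) for qi, u in zip(q, utilities)]


def self_play(payoff, iterations):
    """Self-play on a zero-sum matrix game; payoff[i][j] is the row player's payoff.

    Returns the linearly weighted average strategies of both players.
    """
    n, m = len(payoff), len(payoff[0])
    q_row, q_col = [0.0] * n, [0.0] * m
    avg_row, avg_col = [0.0] * n, [0.0] * m
    for t in range(1, iterations + 1):
        x = rm_plus_strategy(q_row)
        y = rm_plus_strategy(q_col)
        # Utility of each pure action against the opponent's current strategy.
        u_row = [sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n)]
        u_col = [-sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m)]
        q_row = rm_plus_update(q_row, u_row, x)
        q_col = rm_plus_update(q_col, u_col, y)
        for i in range(n):
            avg_row[i] += t * x[i]  # iteration t gets weight t
        for j in range(m):
            avg_col[j] += t * y[j]
    weight = iterations * (iterations + 1) / 2
    return [v / weight for v in avg_row], [v / weight for v in avg_col]


# Example: a 2x2 zero-sum game with a mixed equilibrium.
row_avg, col_avg = self_play([[2.0, -1.0], [-1.0, 1.0]], 1000)
```

Clipping at zero means an action whose accumulated regret has gone negative can be picked up again as soon as it performs well, instead of first having to work off a backlog of negative regret.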

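This paper introduces CFR$^+$, an algorithm that typically outperforms previously known algorithms by an order of magnitude or more in computation time, while potentially requiring less memory. It targets imperfect information games, where CFR and its variants have been among the best approaches for computing approximate Nash equilibria.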

Solving Large Imperfect Information Games Using CFR+

Counterfactual Regret Minimization (CFR) is an efficient no-regret learning
algorithm for decision problems modeled as extensive games. CFR's regret bounds
depend on the requirement of perfect recall: players always remember
information that was revealed to them and the order in which it was revealed.
In games without perfect recall, however, CFR's guarantees do not apply. In
this paper, we present the first regret bound for CFR when applied to a general
class of games with imperfect recall. In addition, we show that CFR applied to
any abstraction belonging to our general class results in a regret bound not
just for the abstract game, but for the full game as well. We verify our theory
and show how imperfect recall can be used to trade a small increase in regret
for a significant reduction in memory in three domains: die-roll poker, phantom
tic-tac-toe, and Bluff.
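
The memory side of this trade-off can be illustrated with a toy count of information sets. The sketch below assumes a hypothetical player who observes a sequence of die rolls and compares a perfect-recall key (the full observation history) with an imperfect-recall key that keeps only the number of observations and the most recent one. Both key functions and the game setup are assumptions made for illustration; they are not the abstractions evaluated in the paper.

```python
from itertools import product


def perfect_recall_key(observations):
    """Perfect recall: the information set is the full ordered history."""
    return tuple(observations)


def forgetful_key(observations, keep_last=1):
    """Imperfect recall: remember only how many observations were made
    and the last keep_last of them."""
    return (len(observations),) + tuple(observations[-keep_last:])


def count_info_sets(key_fn, num_outcomes=6, max_len=3):
    """Count distinct information-set keys over all observation histories
    of length 1..max_len, each observation taking num_outcomes values."""
    keys = set()
    for length in range(1, max_len + 1):
        for obs in product(range(1, num_outcomes + 1), repeat=length):
            keys.add(key_fn(obs))
    return len(keys)


full = count_info_sets(perfect_recall_key)   # 6 + 36 + 216 = 258 keys
merged = count_info_sets(forgetful_key)      # 3 lengths * 6 last rolls = 18 keys
```

Since CFR stores regrets and average strategies per information set, merging histories in this way shrinks the tables directly. The price is that merged histories must share one strategy, which is where the additional regret comes from.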

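This paper presents the first regret bound for CFR applied to a general class of games with imperfect recall, where CFR's existing guarantees do not apply. It also shows that CFR applied to any abstraction in this class yields a regret bound for the full game as well as for the abstract game. Experiments in three domains (die-roll poker, phantom tic-tac-toe, and Bluff) show that imperfect recall can trade a small increase in regret for a significant reduction in memory.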