We present the development and analysis of a reinforcement learning (RL)
algorithm designed to solve continuous-space mean field game (MFG) and mean
field control (MFC) problems in a unified manner. The proposed approach pairs
the actor-critic (AC) paradigm with a representation of the mean field
distribution via a parameterized score function, which can be efficiently
updated in an online fashion, and uses Langevin dynamics to obtain samples from
the resulting distribution. The AC agent and the score function are updated
iteratively and converge to either the MFG equilibrium or the MFC optimum for a
given mean field problem, depending on the choice of learning rates. A
straightforward modification of the algorithm allows us to solve mixed mean
field control games (MFCGs). The performance of our algorithm is evaluated
using linear-quadratic benchmarks in the asymptotic infinite-horizon framework.
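
To make the sampling step concrete, the following is a minimal sketch of drawing samples from a distribution using only its score function and Langevin dynamics. It is an illustration under stated assumptions, not the algorithm of this work: the closed-form Gaussian score `gaussian_score` stands in for the learned parameterized score, and the function names, step size, and particle counts are hypothetical choices made for the example.

```python
import numpy as np

def gaussian_score(x, mean=1.0, std=0.5):
    """Score (gradient of the log-density) of N(mean, std^2).

    Assumption: a closed-form stand-in for a learned, parameterized score.
    """
    return -(x - mean) / std**2

def langevin_samples(score, n_samples=5000, n_steps=500, step_size=1e-2, seed=0):
    """Unadjusted Langevin dynamics on a cloud of particles.

    Each step applies x <- x + (eps / 2) * score(x) + sqrt(eps) * noise.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_samples)  # arbitrary initial particles
    for _ in range(n_steps):
        noise = rng.standard_normal(n_samples)
        x = x + 0.5 * step_size * score(x) + np.sqrt(step_size) * noise
    return x

samples = langevin_samples(gaussian_score)
# The empirical mean and standard deviation should be close to 1.0 and 0.5.
print(samples.mean(), samples.std())
```

Only the score is queried, never the density itself, which is why a score representation pairs naturally with Langevin sampling. The finite step size introduces a small bias in the sampled variance that shrinks as `step_size` decreases.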
