Consider $N$ players each with a $d$-dimensional action set. Each of the
players' utility functions includes their reward function and a linear term for
each dimension, with coefficients that are controlled by the manager. We assume
that the game is strongly monotone, so if each player runs gradient descent,
the dynamics converge to a unique Nash equilibrium (NE). The NE is typically
inefficient in terms of global performance. The resulting global performance of
the system can be improved by imposing $K$-dimensional linear constraints on
the NE. We therefore want the manager to pick the controlled coefficients that
impose the desired constraint on the NE. However, this requires knowing the
players' reward functions and their action sets. Obtaining this game structure
information is infeasible in a large-scale network and violates the users'
privacy. To overcome this, we propose a simple algorithm that learns to shift
the NE of the game to meet the linear constraints by adjusting the controlled
coefficients online. Our algorithm requires only the violation of the linear
constraints as feedback and does not need to know the reward functions or the
action sets. We prove that our algorithm, which is based on two time-scale
stochastic approximation, guarantees convergence with probability 1 to the set
of NE that meet the target linear constraints. We then provide a mean-square
convergence rate of $O(t^{-1/4})$ for our algorithm. This is the first such
bound for two time-scale stochastic approximation where the slower time-scale
is a fixed point iteration with a non-expansive mapping. We demonstrate how our
scheme can be applied to optimizing a global quadratic cost at NE and load
balancing in resource allocation games. We provide simulations of our algorithm
for these scenarios.
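
The two time-scale idea can be illustrated with a minimal sketch. This is not
the paper's algorithm: the quadratic game (`M`, `c`), the constraint (`A`, `b`),
the step-size exponents, the unconstrained action sets, and the choice of
setting the controlled coefficients to `A.T @ lam` are all assumptions made for
illustration. Players take noisy gradient steps on their own utility on the
fast time scale, while the manager adjusts the coefficients on the slow time
scale using only the observed constraint violation.

```python
import numpy as np

def simulate(T=20000, noise_std=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical quadratic game (assumption): player n's reward gradient is
    # c[n] - (M @ x)[n]; a positive definite M makes the game strongly monotone.
    M = np.array([[2.0, 0.5, 0.0],
                  [0.5, 2.0, 0.5],
                  [0.0, 0.5, 2.0]])
    c = np.array([4.0, 3.0, 5.0])
    # Hypothetical target constraint A x = b (K = 1: total load equals 3).
    A = np.array([[1.0, 1.0, 1.0]])
    b = np.array([3.0])

    x = np.zeros(3)    # players' actions (fast time scale)
    lam = np.zeros(1)  # manager's state (slow time scale)
    for t in range(T):
        eta = 0.5 / (t + 1) ** 0.6  # fast step size (players)
        eps = 0.5 / (t + 1) ** 0.8  # slow step size (manager); eps/eta -> 0
        violation = A @ x - b       # the only feedback the manager observes
        alpha = A.T @ lam           # controlled linear coefficients (assumed form)
        # Utility = reward - alpha[n] * x[n]; each player takes a noisy
        # gradient step that increases its own utility.
        grad = c - M @ x - alpha + noise_std * rng.standard_normal(3)
        x = x + eta * grad
        lam = lam + eps * violation
    return x, lam, A @ x - b

x, lam, violation = simulate()
print("actions:", x, "coefficient state:", lam, "violation:", violation)
```

In this toy instance the manager never touches `M` or `c`; it only sees
`A @ x - b`, which mirrors the feedback model described above.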

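We propose a simple algorithm that learns to shift the Nash equilibrium of the game to meet the linear constraints by adjusting the controlled coefficients online, without knowing the reward functions or the action sets. The algorithm converges with probability 1 to the set of Nash equilibria that meet the target linear constraints, and we provide a mean-square convergence rate bound of $O(t^{-1/4})$ for it. We present simulations of the algorithm applied to global quadratic cost optimization and to load balancing in resource allocation games.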

Learning to Control Unknown Strongly Monotone Games

Learning processes by exploiting restricted domain knowledge is an important
task across a plethora of scientific areas, with more and more hybrid methods
combining data-driven and model-based approaches. However, while such hybrid
methods have been tested in various scientific applications, they have been
mostly tested on dynamical systems, with only limited study of the influence
of each model component on global performance and parameter identification. In
this work, we assess the performance of hybrid modeling against traditional
machine learning methods on standard regression problems. We compare, on both
synthetic and real regression problems, several approaches for training such
hybrid models. We focus on hybrid methods that additively combine a parametric
physical term with a machine learning term and investigate model-agnostic
training procedures. We also introduce a new hybrid approach based on partial
dependence functions. Experiments are carried out with different types of
machine learning models, including tree-based models and artificial neural
networks.
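
An additive hybrid model of the kind described above can be sketched as
follows. This is only an illustration under stated assumptions: the synthetic
data, the linear "physical" term `theta * x1`, the sequential
fit-then-residual procedure, and the small k-nearest-neighbour regressor (a
stand-in for the tree-based models and neural networks mentioned above) are
all hypothetical choices, not the procedures evaluated in the paper.

```python
import numpy as np

def fit_knn(X, r, k=10):
    """Stand-in ML term: k-nearest-neighbour regression on residuals r."""
    def predict(Xq):
        d2 = ((Xq[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        idx = np.argpartition(d2, k, axis=1)[:, :k]  # indices of k nearest points
        return r[idx].mean(axis=1)
    return predict

def run_hybrid(n_train=400, n_test=200, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n_train + n_test, 2))
    # Synthetic target (assumption): known physics 2*x1 plus an unknown effect.
    y = 2.0 * X[:, 0] + np.sin(3.0 * X[:, 1]) + 0.1 * rng.standard_normal(len(X))
    Xtr, ytr = X[:n_train], y[:n_train]
    Xte, yte = X[n_train:], y[n_train:]

    # Step 1: fit the parametric physical term theta * x1 by least squares.
    theta = (Xtr[:, 0] @ ytr) / (Xtr[:, 0] @ Xtr[:, 0])
    # Step 2: fit the ML term on what the physical term leaves unexplained.
    ml = fit_knn(Xtr, ytr - theta * Xtr[:, 0])

    pred_phys = theta * Xte[:, 0]
    pred_hybrid = pred_phys + ml(Xte)
    mse_phys = np.mean((yte - pred_phys) ** 2)
    mse_hybrid = np.mean((yte - pred_hybrid) ** 2)
    return theta, mse_phys, mse_hybrid

theta, mse_phys, mse_hybrid = run_hybrid()
print("theta:", theta, "physics-only MSE:", mse_phys, "hybrid MSE:", mse_hybrid)
```

Because the ML term is only accessed through `fit_knn`, it could be swapped for
any other regressor with a fit-and-predict interface, which is the sense in
which this sequential procedure is model-agnostic.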

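This study compares the performance of traditional machine learning methods and hybrid modeling approaches on standard regression problems, with a focus on different training methods for hybrid models. The results show that hybrid modeling performs well when applied to regression problems.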