Many multi-agent systems in practice are decentralized and have dynamically varying dependencies, yet few attempts have been made in the literature to analyze such systems theoretically. In this paper, we propose and theoretically analyze a decentralized model with dynamically varying dependencies called the Locally Interdependent Multi-Agent MDP. This model can represent problems in many disparate domains such as cooperative navigation, obstacle avoidance, and formation control. Despite the intractability that general partially observable multi-agent systems suffer from, we propose three closed-form policies that are theoretically near-optimal in this setting and can be scalable to compute and store. Consequently, we reveal a fundamental property of Locally Interdependent Multi-Agent MDPs: the partially observable decentralized solution is exponentially close to the fully observable solution with respect to the visibility radius. We then discuss extensions of our closed-form policies to further improve tractability. We conclude by providing simulations to investigate some long-horizon behaviors of our closed-form policies.

Locally Interdependent Multi-Agent MDP: Theoretical Framework for Decentralized Agents with Dynamic Dependencies