Traditional Reinforcement Learning (RL) struggles to replicate human-like behaviors, to generalize effectively in multi-agent scenarios, and to overcome inherent interpretability issues. These challenges are compounded when deep environment understanding, agent coordination, and dynamic optimization are required. While Large Language Model (LLM) enhanced methods have shown promise in generalization and interpretability, they often neglect the necessary multi-agent coordination. We therefore introduce the Cascading Cooperative Multi-agent (CCMA) framework, which integrates RL for individual interactions, a fine-tuned LLM for regional cooperation, a reward function for global optimization, and a Retrieval-Augmented Generation (RAG) mechanism to dynamically optimize decision-making across complex driving scenarios. Our experiments show that CCMA outperforms existing RL methods, with significant improvements in both micro-level and macro-level performance in complex driving environments.

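This study addresses the challenges that traditional reinforcement learning faces in replicating human behavior, generalizing effectively, and remaining interpretable in multi-agent scenarios, particularly where deep environment understanding, agent coordination, and dynamic optimization are required. It proposes the Cascading Cooperative Multi-agent (CCMA) framework, which combines reinforcement learning for individual interactions, a fine-tuned large language model for regional cooperation, a reward function for global optimization, and a retrieval-augmented generation mechanism for dynamically optimizing decisions. Experimental results show that CCMA significantly outperforms existing reinforcement learning methods in complex driving environments, with substantial gains in both micro-level and macro-level performance.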

A Cascading Cooperative Multi-agent Framework for Lane-Merging Control Based on Large Language Models