BriefGPT.xyz
Oct, 2023
From Text to Tactic: Evaluating LLMs Playing the Game of Avalon
Jonathan Light, Min Cai, Sheng Shen, Ziniu Hu
TL;DR
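The paper explores the potential of large language models in the social deduction game Resistance Avalon and introduces AvalonBench, a test environment for evaluating the decision-making and language-processing abilities of multi-agent models. The experiments reveal a performance gap for models in this game environment, which motivates a proposal to develop more advanced self-learning models and agent frameworks for simulating complex game environments of this kind.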
Abstract
In this paper, we explore the potential of large language model (LLM) agents in playing the strategic social deduction game Resistance Avalon.