BriefGPT.xyz
Sep, 2019
Emergent Tool Use From Multi-Agent Autocurricula
Bowen Baker, Ingmar Kanitscheider, Todor Markov, Yi Wu, Glenn Powell...
TL;DR
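Through multi-agent competition, a self-supervised autocurriculum, and reinforcement learning algorithms at scale, we find that agents develop multiple distinct emergent strategies, many of which require sophisticated tool use and coordination, and we provide evidence that multi-agent competition may scale to more complex environments.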
Abstract
Through multi-agent competition, the simple objective of hide-and-seek, and standard reinforcement learning algorithms at scale, we find that agents create a self-supervised autocurriculum inducing multiple disti