BriefGPT.xyz
Jul, 2021
Learning Altruistic Behaviours in Reinforcement Learning without External Rewards
Tim Franzmeyer, Mateusz Malinowski, João F. Henriques
TL;DR
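Proposes a method by which artificial agents learn altruistic behaviour without external supervision. The method is based on reinforcement learning: the altruistic agent acts to give other agents more choices and to help them reach more states. It performs well across a variety of cooperative environments.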
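To make the idea of "more choices" concrete, here is a minimal sketch that counts how many states another agent can reach within a fixed number of moves and uses that count as the helper's reward signal. Everything in it is an illustrative assumption rather than the authors' implementation: the toy grid, the door cell `D`, the horizon of 4, and the names `reachable_states`, `with_door`, and `altruistic_reward` are all hypothetical.

```python
from collections import deque

def reachable_states(grid, start, horizon):
    """Count distinct cells reachable from `start` in at most `horizon` moves."""
    rows, cols = len(grid), len(grid[0])
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        (r, c), depth = frontier.popleft()
        if depth == horizon:
            continue  # do not expand beyond the horizon
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in seen:
                seen.add((nr, nc))
                frontier.append(((nr, nc), depth + 1))
    return len(seen)

def with_door(grid, is_open):
    """Copy the grid with every door cell 'D' opened ('.') or closed ('#')."""
    fill = "." if is_open else "#"
    return [row.replace("D", fill) for row in grid]

# Hypothetical toy world: '#' is a wall, 'D' is a door the helper can open.
GRID = [
    "..D..",
    "..#..",
    "..#..",
]
OTHER_AGENT_START = (0, 0)  # assumed position of the agent being helped
HORIZON = 4                 # assumed look-ahead, in moves

def altruistic_reward(door_open):
    """Proxy reward for the helper: the other agent's reachable-state count."""
    world = with_door(GRID, door_open)
    return reachable_states(world, OTHER_AGENT_START, HORIZON)

closed = altruistic_reward(False)
opened = altruistic_reward(True)
print(closed, opened)  # prints "6 10"
```

In this toy setting, opening the door raises the other agent's reachable-state count from 6 to 10, so a helper rewarded by that count would prefer to open it without ever being told the other agent's goal.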
Abstract
Can artificial agents learn to assist others in achieving their goals without knowing what those goals are? Generic reinforcement learning agents could be trained to behave altruistically towards others by reward …