May, 2023
Selective imitation on the basis of reward function similarity
Max Taylor-Davies, Stephanie Droop, Christopher G. Lucas
TL;DR
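When multiple heterogeneous agents pursue different goals or objectives, imitation is unlikely to be an effective strategy. This work studies how people instead tend to imitate the behavior of those they believe have reward functions similar to their own, and how they use inductive biases to make that selection.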
Abstract
Imitation is a key component of human social behavior, and is widely used by both children and adults as a way to navigate uncertain or unfamiliar situations. But in an environment populated by multiple heterogeneous ag