BriefGPT.xyz
Jun, 2020
Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior
Baihan Lin, Djallel Bouneffouf, Guillermo Cecchi
TL;DR
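This paper studies how online learning algorithms behave in the prisoner's dilemma game. It examines what multi-armed bandit, contextual bandit, and reinforcement learning algorithms can do in this setting and how well they fit human behavior, and it draws a number of conclusions about multi-agent competition and strategy dynamics.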
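To make the setting concrete, here is a minimal sketch of the simplest kind of agent mentioned above, a context-free multi-armed bandit, playing the iterated game against a tit-for-tat opponent. It is not the authors' implementation. The payoff values (5, 3, 1, 0), the epsilon-greedy rule, the tit-for-tat opponent, and all names such as `EpsilonGreedyBandit` and `play_against_tit_for_tat` are assumptions chosen for illustration.

```python
import random

# Conventional IPD payoffs (an assumption; the summary does not give the matrix).
# Key: (my_move, opponent_move) -> my reward, with "C" = cooperate, "D" = defect.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
ACTIONS = ("C", "D")


class EpsilonGreedyBandit:
    """Hypothetical context-free bandit: one running mean reward per action."""

    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {a: 0 for a in ACTIONS}
        self.values = {a: 0.0 for a in ACTIONS}

    def act(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.values[a])

    def update(self, action, reward):
        # Incremental update of the sample mean for the chosen action.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def play_against_tit_for_tat(agent, rounds=200):
    """Run the iterated game; the opponent opens with C, then copies the agent."""
    opponent_move = "C"
    history = []
    total = 0
    for _ in range(rounds):
        move = agent.act()
        reward = PAYOFF[(move, opponent_move)]
        agent.update(move, reward)
        history.append((move, opponent_move))
        total += reward
        opponent_move = move  # tit-for-tat repeats the agent's last move
    return history, total


if __name__ == "__main__":
    agent = EpsilonGreedyBandit()
    history, total = play_against_tit_for_tat(agent)
    cooperation_rate = sum(m == "C" for m, _ in history) / len(history)
    print(f"total reward: {total}, cooperation rate: {cooperation_rate:.2f}")
```

A contextual bandit or a full reinforcement learning agent would additionally condition its choice on the recent history of moves. This sketch deliberately ignores that context.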
Abstract
The Prisoner's Dilemma mainly treats the choice to cooperate or defect as an atomic action. We propose to study online learning algorithm behavior in the iterated prisoner's dilemma (IPD) game, where we explored the full spectrum of …