BriefGPT.xyz
Jan, 2018
Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations
Xingyu Wang, Diego Klabjan
TL;DR
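This paper introduces a new inverse reinforcement learning algorithm that uses deep neural network approximation and adversarial training in zero-sum stochastic games to find both the Nash equilibrium and the reward function, addressing problems that earlier methods based on tabular representations could not solve.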
Abstract
This paper considers the problem of inverse reinforcement learning in zero-sum stochastic games when expert demonstrations are known to be not optimal. Compared to previous works that decouple agents in the game
→