BriefGPT.xyz
Jul, 2020
Inverse Reinforcement Learning from a Gradient-based Learner
Giorgia Ramponi, Gianluca Drappo, Marcello Restelli
TL;DR
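This paper proposes a new algorithm for recovering the objective an agent is optimizing from a sequence of that agent's policies. The algorithm rests on the assumption that the observed agent updates its policy parameters along the gradient direction.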
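The gradient-update assumption can be illustrated with a small sketch. This is not the paper's algorithm; it is a minimal least-squares illustration under extra assumptions that the summary does not state: the reward is linear in two features with weights `w`, the learner's step size `lr` is known, and the per-step Jacobians of the feature returns with respect to the policy parameters are given. The names `simulate_learner` and `recover_reward_weights` are hypothetical.

```python
def simulate_learner(theta0, jacobians, w, lr):
    """Generate policy parameters under theta[t+1] = theta[t] + lr * J[t] @ w."""
    thetas = [list(theta0)]
    for J in jacobians:
        prev = thetas[-1]
        step = [row[0] * w[0] + row[1] * w[1] for row in J]
        thetas.append([p + lr * s for p, s in zip(prev, step)])
    return thetas


def recover_reward_weights(thetas, jacobians, lr):
    """Least-squares estimate of 2-D reward weights from observed parameters.

    Assumption (for illustration only): each observed update equals
    lr * J[t] @ w, with J[t] and lr known to the observer.
    """
    # Accumulate the normal equations A w = b, where
    # A = sum_t J[t]^T J[t] and b = sum_t J[t]^T (theta[t+1] - theta[t]) / lr.
    a11 = a12 = a22 = b1 = b2 = 0.0
    for t, J in enumerate(jacobians):
        for i, row in enumerate(J):
            delta = (thetas[t + 1][i] - thetas[t][i]) / lr
            a11 += row[0] * row[0]
            a12 += row[0] * row[1]
            a22 += row[1] * row[1]
            b1 += row[0] * delta
            b2 += row[1] * delta
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("Jacobians do not identify the reward weights")
    # Solve the symmetric 2x2 system with Cramer's rule.
    return [(a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]


# Hypothetical data: two update steps of a 2-parameter policy.
jacobians = [[[1.0, 0.0], [0.5, 1.0]], [[0.2, 1.0], [1.0, -0.3]]]
true_w = [0.7, 0.3]
thetas = simulate_learner([0.0, 0.0], jacobians, true_w, lr=0.1)
estimated_w = recover_reward_weights(thetas, jacobians, lr=0.1)
```

In this noise-free toy setting the estimate matches the weights used to simulate the learner up to floating-point error. In a realistic setting the Jacobians would have to be estimated from the learner's trajectories and the step size may be unknown, which this sketch does not address.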
Abstract
Inverse Reinforcement Learning addresses the problem of inferring an expert's reward function from demonstrations. However, in many applications, we not only have access to the expert's near-optimal behavior, but …