BriefGPT.xyz
Feb, 2023
Imitation from Arbitrary Experience: A Dual Unification of Reinforcement and Imitation Learning Methods
Harshit Sikchi, Amy Zhang, Scott Niekum
TL;DR
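Drawing on reinforcement learning, convex optimization, and unbiased learning methods, this paper proposes a new approach, dual RL, that can be used for unbiased learning from biased offline data.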
Abstract
It is well known that reinforcement learning (RL) can be formulated as a convex program with linear constraints. The dual form of this formulation is unconstrained, which we refer to as
→
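For readers who want the convex-program view spelled out, here is a sketch of the standard occupancy-measure formulation and its Lagrangian dual. The notation is an assumption of this sketch, not taken from the excerpt: $d(s,a)$ is the discounted state-action visitation distribution, $d_0$ the initial state distribution, $d^O$ the distribution of the offline data, $D_f$ an $f$-divergence with convex conjugate $f^*$, and $\alpha > 0$ a regularization weight. The nonnegativity constraint on $d$ is ignored when forming the dual, so the last line should be read as illustrative, not as the paper's exact objective.

```latex
\begin{align}
  % Primal: a concave objective maximized over d, subject to linear flow constraints
  \max_{d \ge 0} \quad
    & \mathbb{E}_{(s,a) \sim d}\big[ r(s,a) \big]
      - \alpha \, D_f\big( d \,\|\, d^O \big) \\
  \text{s.t.} \quad
    & \sum_{a} d(s,a)
      = (1-\gamma)\, d_0(s)
      + \gamma \sum_{s', a'} d(s', a')\, p(s \mid s', a')
      \qquad \forall s \\[4pt]
  % Dual: one multiplier V(s) per flow constraint, with no constraints left on V
  \min_{V} \quad
    & (1-\gamma)\, \mathbb{E}_{s \sim d_0}\big[ V(s) \big]
      + \alpha \, \mathbb{E}_{(s,a) \sim d^O}\!\left[
          f^*\!\left(
            \frac{ r(s,a)
                   + \gamma \, \mathbb{E}_{s' \sim p(\cdot \mid s,a)}\big[ V(s') \big]
                   - V(s) }{ \alpha }
          \right)
        \right]
\end{align}
```

The multipliers $V(s)$ play the role of a value function, and the argument of $f^*$ is a scaled temporal-difference residual. Since the expectation in the dual is taken under $d^O$, the objective can be estimated from offline samples alone.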