Imitation from Arbitrary Experience: A Dual Unification of Reinforcement and Imitation Learning Methods

Feb, 2023

Imitation from Arbitrary Experience: A Dual Unification of Reinforcement and Imitation Learning Methods

Harshit Sikchi, Amy Zhang, Scott Niekum

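TL;DR: Drawing on reinforcement learning, convex optimization, and unbiased learning methods, this paper proposes a new approach, dual RL, that can be used for unbiased learning from biased offline data.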

Abstract

It is well known that reinforcement learning (RL) can be formulated as a convex program with linear constraints. The dual form of this formulation is unconstrained, which we refer to as →