BriefGPT.xyz
Feb, 2024
The Virtues of Pessimism in Inverse Reinforcement Learning
David Wu, Gokul Swamy, J. Andrew Bagnell, Zhiwei Steven Wu, Sanjiban Choudhury
TL;DR
By using an offline RL algorithm as part of the IRL procedure, we can more efficiently find policies that match expert performance.
Abstract
Inverse reinforcement learning (IRL) is a powerful framework for learning complex behaviors from expert demonstrations. However, it tradit