BriefGPT.xyz
Jan, 2025
Projection Implicit Q-Learning with Support Constraint for Offline Reinforcement Learning
Xinchen Han, Hossam Afifi, Michel Marot
TL;DR
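This study addresses the extrapolation error caused by out-of-distribution actions in offline reinforcement learning. It proposes the Proj-IQL algorithm, which introduces a support constraint and a vector projection technique to improve both policy evaluation and policy improvement. Experiments show that Proj-IQL performs strongly on the D4RL benchmark, particularly in the challenging navigation domains.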
Abstract
Offline Reinforcement Learning (RL) faces a critical challenge of extrapolation errors caused by out-of-distribution (OOD) actions. Implicit Q-Learning (IQL) algorithm employs expectile regression to achieve in-s
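The abstract is cut off at this point, but the expectile regression it attributes to IQL can be illustrated on its own. The following is a minimal sketch of the standard asymmetric squared loss, |tau - 1(u < 0)| * u^2, and not the Proj-IQL method itself. The function names `expectile_loss` and `fit_expectile`, the default `tau=0.7`, and the toy samples are assumptions made for illustration only.

```python
def expectile_loss(diffs, tau=0.7):
    """Mean asymmetric squared loss |tau - 1(u < 0)| * u**2 over residuals u.

    In IQL the residual would be u = Q(s, a) - V(s); here it is any number.
    """
    total = 0.0
    for u in diffs:
        weight = tau if u >= 0 else 1.0 - tau  # positive residuals weigh more when tau > 0.5
        total += weight * u * u
    return total / len(diffs)


def fit_expectile(samples, tau=0.7, lr=0.1, steps=2000):
    """Find the scalar v minimising expectile_loss([s - v for s in samples])
    by plain gradient descent (hypothetical helper, for illustration)."""
    v = sum(samples) / len(samples)  # start from the mean
    for _ in range(steps):
        grad = 0.0
        for s in samples:
            u = s - v
            weight = tau if u >= 0 else 1.0 - tau
            grad += -2.0 * weight * u  # d/dv of weight * (s - v)**2
        v -= lr * grad / len(samples)
    return v


if __name__ == "__main__":
    q_values = [0.0, 1.0, 2.0, 3.0, 4.0]  # toy stand-ins for Q(s, a) over dataset actions
    print(fit_expectile(q_values, tau=0.5))  # tau = 0.5 recovers the mean
    print(fit_expectile(q_values, tau=0.9))  # larger tau moves toward the largest sample
```

With `tau = 0.5` the loss is symmetric and the fitted value is the sample mean. As `tau` approaches 1 the fitted value moves toward the largest sample, so the estimate approximates a maximum while only ever using values that appear in the data.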