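TL;DR: Model-based offline reinforcement learning methods suffer from a boundary problem that existing theoretical frameworks cannot resolve. The paper therefore proposes Reach-Aware Value Learning (RAVL), a new method that addresses this problem effectively.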
Abstract
Offline reinforcement learning aims to enable agents to be trained from
pre-collected datasets; however, this comes with the added challenge of
estimating the value of behavior not covered in the dataset. Model-based
methods offer a solution by allowing agents to collect additional syn