Jul, 2023
Goal-Conditioned Predictive Coding as an Implicit Planner for Offline Reinforcement Learning
Zilai Zeng, Ce Zhang, Shijie Wang, Chen Sun
TL;DR
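This study examines whether sequence modeling can extract useful representations from trajectory data and contribute to policy learning. It introduces Goal-Conditioned Predictive Coding (GCPC), which achieves competitive performance by learning goal-conditioned latent representations of the future.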
Abstract
Recent work has demonstrated the effectiveness of formulating decision making as a supervised learning problem on offline-collected trajectories. However, the benefits of performing sequence modeling on trajector…