May, 2024
Ensemble Successor Representations for Task Generalization in Offline-to-Online Reinforcement Learning
Changhong Wang, Xudong Yu, Chenjia Bai, Qiaosheng Zhang, Zhen Wang
TL;DR
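A method that uses offline datasets to build successor representations and an ensemble of Q-functions, enabling task generalization and fast adaptation to new tasks in offline-to-online learning.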
Abstract
In reinforcement learning (RL), training a policy from scratch with online experiences can be inefficient because of the difficulties in exploration. Recently, offline RL provides a promising solution by giving a …