Oct, 2022
Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories
Qinqing Zheng, Mikael Henaff, Brandon Amos, Aditya Grover
TL;DR
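The authors develop a new algorithmic pipeline that draws on multiple data sources for offline reinforcement learning. With only 10% of the data labeled, it achieves performance similar to training on a fully labeled dataset. They also run large-scale controlled experiments to identify best practices for applying semi-supervised learning to RL.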
Abstract
Natural agents can effectively learn from multiple data sources that differ in size, quality, and types of measurements. We study this heterogeneity in the context of offline reinforcement learning (RL) by introd