BriefGPT.xyz
Jun, 2020
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair, Murtaza Dalal, Abhishek Gupta, Sergey Levine
TL;DR
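This paper presents a method, applicable to real-world robot control, that combines previously collected data with online learning. By pairing dynamic programming with policy updates, it improves learning efficiency and shortens training time to a practically acceptable range.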
Abstract
Reinforcement learning provides an appealing formalism for learning control policies from experience. However, the classic active formulation of reinforcement learning necessitates a lengthy active exploration pr…