BriefGPT.xyz
Jun, 2020
Task-agnostic Exploration in Reinforcement Learning
Xuezhou Zhang, Yuzhe Ma, Adish Singla
TL;DR
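This work proposes a framework called task-agnostic reinforcement learning (task-agnostic RL) to address the challenge of efficient exploration in RL. The framework uses sampled reward values together with a set of exploration trajectories to find optimal policies for complex tasks, and the paper presents UCBZero, an efficient algorithm that works from sampled reward values.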
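The two-phase idea in the TL;DR (collect trajectories without a reward, then use reward values for a given task to derive a policy) can be illustrated with a toy tabular sketch. This is not the paper's UCBZero algorithm: the exploration rule below is a simple least-visited-action heuristic swapped in for illustration, and the planner is plain backward value estimation over the logged transitions. The function names, the transition table `P`, the fixed start state 0, and the `reward(s, a)` callable are all assumptions made for this example.

```python
import random


def explore_without_reward(P, H, episodes, seed=0):
    """Phase 1 (illustrative): log transitions with no reward signal.

    P[s][a] is a list of next-state probabilities for a hypothetical toy MDP.
    At each (step, state) the least-visited action is tried, which is a
    simple stand-in for a bonus-driven exploration rule.
    """
    rng = random.Random(seed)
    S, A = len(P), len(P[0])
    counts = [[[0] * A for _ in range(S)] for _ in range(H)]
    data = []
    for _ in range(episodes):
        s = 0  # assumption: every episode starts in state 0
        for h in range(H):
            a = min(range(A), key=lambda x: counts[h][s][x])
            s2 = rng.choices(range(S), weights=P[s][a])[0]
            counts[h][s][a] += 1
            data.append((h, s, a, s2))
            s = s2
    return data


def plan_for_task(data, reward, S, A, H):
    """Phase 2 (illustrative): derive a greedy policy for one task.

    reward(s, a) returns a (possibly sampled) reward for the task.
    Steps are processed from last to first, so each Q[h] is the running
    mean of reward + best next-step value over the logged transitions.
    Unvisited (h, s, a) entries keep the default value 0.0.
    """
    Q = [[[0.0] * A for _ in range(S)] for _ in range(H + 1)]
    n = [[[0] * A for _ in range(S)] for _ in range(H)]
    for h in reversed(range(H)):
        for h_t, s, a, s2 in data:
            if h_t != h:
                continue
            n[h][s][a] += 1
            target = reward(s, a) + max(Q[h + 1][s2])
            Q[h][s][a] += (target - Q[h][s][a]) / n[h][s][a]
    return [[max(range(A), key=lambda x: Q[h][s][x]) for s in range(S)]
            for h in range(H)]
```

The point of the split is that `data` is collected once and can then be reused for several reward functions, each passed to `plan_for_task` separately.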
Abstract
Efficient exploration is one of the main challenges in reinforcement learning (RL). Most existing sample-efficient algorithms assume the existence of a single reward function during exploration. In many practical