BriefGPT.xyz
Feb, 2020
Reward-Free Exploration for Reinforcement Learning
Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu
TL;DR
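This paper proposes a new "reward-free RL" framework: during the exploration phase, the agent collects trajectories from the MDP to find exploratory policies, and a black-box approximate planner is then used to compute near-optimal policies.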
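The two-phase protocol in the summary above can be illustrated with a minimal sketch, assuming a tiny tabular MDP. Uniform random exploration and finite-horizon value iteration are stand-ins chosen here for simplicity, not the paper's exploration scheme or planner, and every name (`collect_reward_free`, `estimate_model`, `plan`, the toy MDP `P`) is hypothetical.

```python
import random

def collect_reward_free(P, H, episodes, rng):
    """Exploration phase: roll out a uniformly random policy (an assumption,
    not the paper's method) and count transitions. No reward is observed."""
    S, A = len(P), len(P[0])
    counts = [[[0] * S for _ in range(A)] for _ in range(S)]
    for _ in range(episodes):
        s = 0  # assumed fixed start state
        for _ in range(H):
            a = rng.randrange(A)
            s_next = rng.choices(range(S), weights=P[s][a])[0]
            counts[s][a][s_next] += 1
            s = s_next
    return counts

def estimate_model(counts):
    """Empirical transition model built from the reward-free data."""
    S, A = len(counts), len(counts[0])
    P_hat = []
    for s in range(S):
        row = []
        for a in range(A):
            n = sum(counts[s][a])
            if n:
                row.append([c / n for c in counts[s][a]])
            else:
                # Unvisited pair: fall back to a self-loop (arbitrary choice).
                row.append([1.0 if t == s else 0.0 for t in range(S)])
        P_hat.append(row)
    return P_hat

def plan(P_hat, R, H):
    """Planning phase: finite-horizon value iteration on the estimated model,
    standing in for a black-box planner. Returns policy[h][s]."""
    S, A = len(P_hat), len(P_hat[0])
    V = [0.0] * S
    policy = []
    for _ in range(H):  # backward induction from the last step
        Q = [[R[s][a] + sum(p * v for p, v in zip(P_hat[s][a], V))
              for a in range(A)] for s in range(S)]
        policy.append([max(range(A), key=lambda a: Q[s][a]) for s in range(S)])
        V = [max(q) for q in Q]
    policy.reverse()  # index 0 is now the first time step
    return policy

# Toy MDP (hypothetical): 2 states, action 0 stays, action 1 switches state.
P = [[[1, 0], [0, 1]],
     [[0, 1], [1, 0]]]
H = 3
counts = collect_reward_free(P, H, episodes=200, rng=random.Random(0))
P_hat = estimate_model(counts)

# Rewards are revealed only after exploration; the same data serves both.
R_state1 = [[0, 0], [1, 1]]  # reward for being in state 1
R_state0 = [[1, 1], [0, 0]]  # reward for being in state 0
policy_a = plan(P_hat, R_state1, H)
policy_b = plan(P_hat, R_state0, H)
```

The point of the sketch is that `collect_reward_free` runs once, while `plan` can be called for any number of reward functions supplied afterwards without touching the environment again.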
Abstract
Exploration is widely regarded as one of the most challenging aspects of reinforcement learning (RL), with many naive approaches succumbing to exponential sample complexity. To isolate the challenges of …