Jun, 2020
Exploration by Maximizing Rényi Entropy for Zero-Shot Meta RL
Chuheng Zhang, Yuanying Cai, Longbo Huang, Jian Li
TL;DR
By maximizing the Rényi entropy of the state-action distribution, this work proposes a reward-free RL framework for zero-shot meta RL that cleanly separates exploration from exploitation, and designs a corresponding batch RL algorithm so that arbitrary reward functions can be handled in the planning phase.
Abstract
Exploring the transition dynamics is essential to the success of reinforcement learning (RL) algorithms. To face the challenges of exploration, we consider a zero-shot meta RL framework that completely separates exploration from exploitation.
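The exploration objective above is the Rényi entropy of order α, H_α(p) = (1/(1−α)) · log Σᵢ pᵢ^α, which recovers Shannon entropy in the limit α → 1. As a minimal sketch (not the paper's algorithm, just the quantity being maximized), the following hypothetical helper computes it for a discrete distribution:

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Rényi entropy of order `alpha` for a discrete distribution `p`.

    H_alpha(p) = log(sum_i p_i^alpha) / (1 - alpha);
    the limit alpha -> 1 recovers the Shannon entropy.
    """
    p = np.asarray(p, dtype=float)
    p = p / p.sum()  # normalize to a valid probability distribution
    if np.isclose(alpha, 1.0):
        # Shannon entropy, skipping zero-probability entries
        return float(-np.sum(np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)))
    return float(np.log(np.sum(p ** alpha)) / (1.0 - alpha))

# A uniform distribution over 4 states maximizes the entropy: log 4
print(renyi_entropy([0.25, 0.25, 0.25, 0.25], alpha=0.5))
# A skewed distribution yields a strictly lower value
print(renyi_entropy([0.7, 0.1, 0.1, 0.1], alpha=0.5))
```

Maximizing this quantity over the state(-action) distribution induced by the exploration policy encourages uniform coverage of the dynamics, which is what makes the collected data useful for planning against any downstream reward function.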