BriefGPT.xyz
Oct, 2022
LECO: Learnable Episodic Count for Task-Specific Intrinsic Reward
Daejin Jo, Sungwoong Kim, Daniel Wontae Nam, Taehwan Kwon, Seungeun Rho...
TL;DR
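This paper proposes LECO, a learnable hash-based episodic count method. It uses a vector quantized variational autoencoder and a task-specific modulator to address task-irrelevant distractions and state compression, and it balances exploration and exploitation in reinforcement learning in complex scenarios.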
Abstract
Episodic count has been widely used to design a simple yet effective intrinsic motivation for reinforcement learning with a sparse reward.
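To make the idea of an episodic count concrete, here is a minimal sketch of a generic count-based bonus. It is not LECO itself: the class name `EpisodicCounter`, the `scale` parameter, and the `1/sqrt(N)` bonus form are assumptions chosen for illustration, and `state_code` stands in for whatever discrete, hashable code a state is mapped to.

```python
import math
from collections import Counter


class EpisodicCounter:
    """Hypothetical helper: visit counts over discrete state codes within one episode."""

    def __init__(self, scale=1.0):
        self.scale = scale        # assumed bonus coefficient
        self.counts = Counter()   # state_code -> visits in the current episode

    def reset(self):
        # Call at the start of every episode so the counts stay episodic.
        self.counts.clear()

    def bonus(self, state_code):
        # Record the visit, then return a bonus that shrinks on repeat visits.
        self.counts[state_code] += 1
        return self.scale / math.sqrt(self.counts[state_code])


counter = EpisodicCounter()
first = counter.bonus((3, 1))    # first visit to this code in the episode
second = counter.bonus((3, 1))   # repeat visit earns a smaller bonus
counter.reset()
fresh = counter.bonus((3, 1))    # counts restart in the next episode
```

In a sparse-reward setting, such a bonus would typically be added to the environment reward at each step, so that states not yet visited in the current episode look more attractive to the agent.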