BriefGPT.xyz
Apr, 2023
DEIR: Efficient and Robust Exploration through Discriminative-Model-Based Episodic Intrinsic Rewards
Shanchuan Wan, Yujin Tang, Yingtao Tian, Tomoyuki Kaneko
TL;DR
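This paper proposes DEIR, an exploration reward method based on conditional mutual information that rewards the novelty arising from the agent's own exploration. In experiments on the ProcGen benchmark, it shows fast learning and good generalization.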
Abstract
Exploration is a fundamental aspect of reinforcement learning (RL), and its effectiveness crucially decides the performance of RL algorithms, especially when facing sparse extrinsic rewards. Recent studies showed …