Jul, 2022
USHER: Unbiased Sampling for Hindsight Experience Replay
Liam Schramm, Yunfu Deng, Edgar Granados, Abdeslam Boularias
TL;DR
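Proposes an importance-sampling-based algorithm to address the bias problem arising from sparse rewards, and demonstrates its effectiveness in high-dimensional stochastic environments.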
Abstract
Dealing with sparse rewards is a long-standing challenge in reinforcement learning (RL). Hindsight experience replay (HER) addresses this
→