BriefGPT.xyz
Oct, 2022
Goal Exploration Augmentation via Pre-trained Skills for Sparse-Reward Long-Horizon Goal-Conditioned Reinforcement Learning
Lisheng Wu, Ke Chen
TL;DR
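This paper proposes a new learning objective that optimizes the entropy of both achieved goals and new goals yet to be explored, enabling more efficient goal exploration in sub-goal-selection-based GCRL. The method can significantly improve the exploration efficiency of existing techniques while improving or maintaining their performance.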
Abstract
Reinforcement learning (RL) often struggles to accomplish a sparse-reward long-horizon task in a complex environment. Goal-conditioned reinforcement learning (GCRL) has been employed to tackle this difficult prob…