BriefGPT.xyz
Jan, 2023
Outcome-directed Reinforcement Learning by Uncertainty & Temporal Distance-Aware Curriculum Goal Generation
Daesol Cho, Seungjae Lee, H. Jin Kim
TL;DR
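This paper proposes an uncertainty and temporal distance-aware curriculum goal generation method for reinforcement learning. By solving a bipartite matching problem, the method gives the curriculum precisely calibrated guidance, addresses the shortcomings of previous curriculum RL methods, and significantly outperforms them both quantitatively and qualitatively.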
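To make the bipartite matching step concrete, here is a minimal sketch of a minimum-cost assignment between candidate curriculum goals and desired-outcome examples. It is an illustration only, not the authors' implementation: the function name `min_cost_matching`, the cost values, and the reading of each cost as some combination of uncertainty and temporal distance are all assumptions made for this example.

```python
from itertools import permutations


def min_cost_matching(cost):
    """Brute-force minimum-cost perfect matching on a square cost matrix.

    cost[i][j] is the (hypothetical) cost of pairing candidate goal i
    with desired-outcome example j. Returns (total_cost, assignment),
    where assignment[i] is the column matched to row i.
    """
    n = len(cost)
    best_total, best_assignment = float("inf"), None
    # Try every one-to-one pairing; only feasible for very small n.
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_total, best_assignment = total, perm
    return best_total, best_assignment


# Assumed toy costs: rows are candidate goals, columns are desired outcomes.
cost = [
    [4.0, 1.0, 3.0],
    [2.0, 0.0, 5.0],
    [3.0, 2.0, 2.0],
]
total, assignment = min_cost_matching(cost)
print(total, assignment)
```

The brute-force search is used here only to keep the sketch dependency-free. For realistic problem sizes, a polynomial-time solver such as the Hungarian algorithm would be the usual choice.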
Abstract
Current reinforcement learning (RL) often suffers when solving a challenging exploration problem where the desired outcomes or high rewards are rarely observed. Even though curriculum RL, a framework that solves