Aug, 2022
Cluster-based Sampling in Hindsight Experience Replay for Robot Control
Taeyoung Kim, Dongsoo Har
TL;DR
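The paper proposes a cluster-based sampling strategy that groups trajectories by properties of their achieved goals and samples experiences from those groups, addressing the sparse-reward problem in multi-goal reinforcement learning. Experiments on three robot control tasks show significant improvements: shorter convergence time and a higher success rate.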
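As a rough illustration of the idea, the sketch below clusters trajectories by their achieved goals and then samples trajectories evenly across clusters. The summary does not say which goal property or clustering algorithm the authors use, so several choices here are assumptions: each trajectory is represented by its final achieved goal, clustering is a plain k-means, and sampling picks a cluster uniformly and then a trajectory uniformly within it. The names `kmeans` and `sample_trajectories` are hypothetical.

```python
import numpy as np


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns one label per point."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct data points as initial centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members (unchanged if empty).
        for c in range(k):
            members = points[labels == c]
            if len(members) > 0:
                centers[c] = members.mean(axis=0)
    return labels


def sample_trajectories(final_goals, k, batch_size, seed=0):
    """Pick trajectory indices: uniform over clusters, then uniform within one.

    final_goals: array of shape (num_trajectories, goal_dim), assumed to hold
    the last achieved goal of each stored trajectory.
    """
    labels = kmeans(final_goals, k, seed=seed)
    rng = np.random.default_rng(seed)
    nonempty = [c for c in range(k) if np.any(labels == c)]
    picks = []
    for _ in range(batch_size):
        c = rng.choice(nonempty)
        picks.append(rng.choice(np.flatnonzero(labels == c)))
    return np.array(picks), labels
```

Under these assumptions, trajectories whose achieved goals fall in a rarely visited region are drawn about as often as those from a crowded region, whereas uniform sampling over the buffer would favor the crowded one. Hindsight relabeling of the goals would then be applied to the sampled trajectories as usual.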
Abstract
In multi-goal reinforcement learning in an environment, agents learn policies to achieve multiple goals by using experiences gained from interactions with the environment. With a sparse binary reward, training ag…