BriefGPT.xyz
Nov, 2023
Goal-conditioned Offline Planning from Curious Exploration
Marco Bagatella, Georg Martius
TL;DR
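By analyzing the geometry of optimal goal-conditioned value functions, we propose combining model-based planning with a graph-based value aggregation scheme to correct estimation artifacts in learned value functions, significantly improving zero-shot goal-reaching performance across a variety of simulated environments.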
Abstract
Curiosity has established itself as a powerful exploration strategy in deep reinforcement learning. Notably, leveraging expected future novelty as intrinsic motivation has been shown to efficiently generate explo