Oct, 2024
Deterministic Exploration via Stationary Bellman Error Maximization
Sebastian Griesbach, Carlo D'Eramo
TL;DR
This work addresses the challenge of exploration in reinforcement learning by proposing a new architecture that stably optimizes the Bellman error to obtain a deterministic exploration policy. The method not only leverages past experience to optimize exploration, but also introduces an exploration objective that is independent of episode length, outperforming the conventional ε-greedy strategy in both dense- and sparse-reward environments.
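The core idea — choosing exploration actions deterministically by maximizing an estimate of the Bellman error, rather than injecting randomness as ε-greedy does — can be illustrated with a toy tabular sketch. This is not the paper's implementation (which operates in the deep RL setting); the chain MDP, learning rate, and tie-breaking rule below are illustrative assumptions.

```python
import numpy as np

# Toy sketch (not the paper's method as published): tabular Q-learning on a
# small chain MDP where the *exploration* action is chosen deterministically
# as the one with the largest absolute Bellman (TD) error estimate.

n_states, n_actions, gamma, lr = 5, 2, 0.9, 0.5
rng = np.random.default_rng(0)
Q = np.zeros((n_states, n_actions))

def step(s, a):
    # Chain dynamics: action 1 moves right, action 0 moves left;
    # reward 1 only on reaching the rightmost state.
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == n_states - 1 else 0.0)

def bellman_error(s, a):
    # Dynamics are deterministic here, so a one-step lookahead suffices.
    s2, r = step(s, a)
    return abs(r + gamma * Q[s2].max() - Q[s, a])

s = 0
for _ in range(500):
    # Deterministic exploration: maximize the Bellman error estimate;
    # ties (e.g. all-zero errors at the start) are broken randomly.
    errs = np.array([bellman_error(s, a) for a in range(n_actions)])
    a = int(rng.choice(np.flatnonzero(errs == errs.max())))
    s2, r = step(s, a)
    Q[s, a] += lr * (r + gamma * Q[s2].max() - Q[s, a])
    s = 0 if s2 == n_states - 1 else s2  # reset at terminal

print(np.round(Q, 2))
```

Once any state-action pair acquires a nonzero TD error, the errors propagate backward and the exploration rule steers the agent toward under-learned regions, whereas ε-greedy would keep wandering uniformly at random.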
Abstract
Exploration is a crucial and distinctive aspect of Reinforcement Learning (RL) that remains a fundamental open problem. Several methods have been proposed to tackle this challenge. Commonly used methods inject ra…