Jun, 2023
Vanishing Bias Heuristic-guided Reinforcement Learning Algorithm
Qinru Li, Hao Xiang
TL;DR
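This study examines classical and neural-network-based reinforcement learning methods in the Lunar Lander environment and proposes a new algorithm, Heuristic RL, which introduces heuristics to guide the early stages of training while mitigating the influence of human bias. Experimental results show that the proposed method performs well in the Lunar Lander environment.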
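As a rough illustration of the idea summarized above (a heuristic that steers early training and then fades out), the following is a minimal sketch and not the authors' implementation. It assumes a linear decay schedule and an additive heuristic bonus on the Q-values; the names `heuristic_weight`, `select_action`, and `decay_steps` are hypothetical and do not come from the paper.

```python
def heuristic_weight(step, decay_steps):
    """Assumed linear schedule: 1.0 at step 0, 0.0 from decay_steps onward."""
    return max(0.0, 1.0 - step / decay_steps)


def select_action(q_values, heuristic_scores, step, decay_steps=10_000):
    """Pick the greedy action over Q-values plus a vanishing heuristic bonus.

    q_values:         learned action-value estimates, one per action
    heuristic_scores: hand-written preferences, one per action (hypothetical)
    """
    w = heuristic_weight(step, decay_steps)
    combined = [q + w * h for q, h in zip(q_values, heuristic_scores)]
    # Index of the largest combined score (first index wins on ties).
    return max(range(len(combined)), key=combined.__getitem__)


# Four discrete actions, as in the discrete Lunar Lander task.
q = [1.0, 0.0, 0.0, 0.0]   # the learned values prefer action 0
h = [0.0, 0.0, 5.0, 0.0]   # the hand-written heuristic prefers action 2

early = select_action(q, h, step=0)       # heuristic dominates
late = select_action(q, h, step=10_000)   # heuristic weight has vanished
```

Under these assumptions the heuristic decides the action early in training, and once its weight reaches zero the choice depends only on the learned Q-values, so any human bias encoded in the heuristic no longer affects the final policy.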
Abstract
Reinforcement learning has achieved tremendous success in many Atari games. In this paper we explored the lunar lander environment and implemented classical methods including …