BriefGPT.xyz
Jun, 2022
前序特征
Predecessor Features
HTML
PDF
Duncan Bailey, Marcelo Mattar
TL;DR
探究了一种名为 'Predecessor Features' 的算法,它通过维护一个近似过去积累经验和的方法,允许将时序差分误差准确地传播到比传统方法更多的前身状态中,从而大大提高了增强学习的效率和性能。
Abstract
Any
reinforcement learning
system must be able to identify which past events contributed to observed outcomes, a problem known as
credit assignment
. A common solution to this problem is to use an
→