BriefGPT.xyz
Jun, 2012
A Dantzig Selector Approach to Temporal Difference Learning
Matthieu Geist, Bruno Scherrer, Alessandro Lazaric, Mohammad Ghavamzadeh
TL;DR
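This paper introduces a new algorithm that combines LSTD with the Dantzig Selector, addressing the difficulty of integrating L1 regularization into LSTD. The algorithm is suited to high-dimensional problems.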
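As a rough sketch of what combining LSTD with a Dantzig Selector can look like, the block below writes the standard LSTD linear system and a Dantzig-style constrained L1 program over it. The notation (\Phi, \Phi', r, \gamma, \lambda, n, p) is assumed here for illustration and does not appear in this summary; the paper's exact estimator and scaling may differ.

```latex
% Assumed setup: n sampled transitions (s_i, r_i, s_i'), p features,
% \Phi \in \mathbb{R}^{n \times p} holds the features of s_i,
% \Phi' \in \mathbb{R}^{n \times p} holds the features of s_i',
% r \in \mathbb{R}^{n} holds the rewards, \gamma \in [0, 1) is the discount factor.
\begin{align*}
  A &= \tfrac{1}{n}\, \Phi^{\top} (\Phi - \gamma \Phi'), &
  b &= \tfrac{1}{n}\, \Phi^{\top} r, \\
  \text{LSTD:}\quad & A\,\theta = b, \\
  \text{Dantzig-style variant:}\quad &
  \hat{\theta} \in \arg\min_{\theta \in \mathbb{R}^{p}} \|\theta\|_{1}
  \quad \text{subject to} \quad \|A\theta - b\|_{\infty} \le \lambda .
\end{align*}
```

Under these assumptions the constrained problem is a linear program, and \lambda trades sparsity of \theta against how closely the LSTD system is satisfied.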
Abstract
LSTD is a popular algorithm for value function approximation. Whenever the number of features is larger than the number of samples, it must be paired with some form of regularization. In particular,