BriefGPT.xyz
Nov, 2018
An In-Depth Look at Deep Policy Gradients
Are Deep Policy Gradient Algorithms Truly Policy Gradient Algorithms?
Andrew Ilyas, Logan Engstrom, Shibani Santurkar, Dimitris Tsipras, Firdaus Janoos...
TL;DR
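The paper studies how the behavior of deep policy gradient algorithms reflects the conceptual framework motivating their development, and proposes a fine-grained analysis of state-of-the-art methods. The results show that the behavior of deep policy gradient algorithms often deviates from what their motivating framework predicts. This points to gaps in our understanding of current methods and suggests the need to move beyond the current benchmark-centric evaluation methodology.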
Abstract
We study how the behavior of deep policy gradient algorithms reflects the conceptual framework motivating their development. We propose a fine-grained analysis of state-of-the-art methods based on key aspects of this framework: