Oct, 2017
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
Will Grathwohl, Dami Choi, Yuhuai Wu, Geoff Roeder, David Duvenaud
TL;DR
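This work introduces a general framework for learning low-variance, unbiased gradient estimators for black-box functions of random variables. The framework is applied to training discrete latent-variable models, and the authors also propose an unbiased, action-conditional extension of the advantage actor-critic reinforcement learning algorithm.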
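To make the idea of a control variate concrete, here is a minimal sketch in plain Python. It does not implement the learned estimator from the paper; it only shows the simpler, classical case of a score-function (REINFORCE) gradient estimator with a constant baseline. The toy objective `f`, the target `t = 0.45`, the parameter value `theta = 0.3`, and all function names are assumptions chosen for illustration. The sketch compares the estimator with and without the baseline: both should average to the same gradient, while the baseline version should show much lower variance.

```python
import random
import statistics


def f(b, t=0.45):
    """Toy black-box objective on a binary sample (illustrative assumption)."""
    return (b - t) ** 2


def score(b, theta):
    """d/dtheta of log Bernoulli(b | theta)."""
    return 1.0 / theta if b == 1 else -1.0 / (1.0 - theta)


def estimates(theta, baseline, n, seed=0):
    """Draw n single-sample score-function gradient estimates.

    Subtracting a constant baseline keeps the estimator unbiased,
    because the score has zero mean under the sampling distribution.
    """
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        b = 1 if rng.random() < theta else 0
        out.append((f(b) - baseline) * score(b, theta))
    return out


theta = 0.3
# For a Bernoulli variable, d/dtheta E[f(b)] = f(1) - f(0) exactly.
true_grad = f(1) - f(0)

plain = estimates(theta, baseline=0.0, n=20000)

# Constant baseline set to E[f(b)]; computed in closed form for this toy case.
expected_f = theta * f(1) + (1 - theta) * f(0)
with_cv = estimates(theta, baseline=expected_f, n=20000)

print("true gradient:        ", true_grad)
print("plain mean / variance:", statistics.mean(plain), statistics.variance(plain))
print("with baseline:        ", statistics.mean(with_cv), statistics.variance(with_cv))
```

In this toy setting the baseline is a fixed number computed analytically. The framework summarized above goes further by learning the control variate, but the principle that a well-chosen control variate lowers variance without adding bias is the same.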
Abstract
Gradient-based optimization is the foundation of deep learning and reinforcement learning. Even when the mechanism being optimized is unkn