BriefGPT.xyz
Feb, 2018
DiCE: The Infinitely Differentiable Monte-Carlo Estimator
Jakob Foerster, Greg Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric P. Xing...
TL;DR
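This paper introduces DiCE, a single objective function that yields correct estimators for derivatives of any order in stochastic computation graphs. Unlike the Surrogate Loss approach, which approximates with fixed samples, DiCE relies on automatic differentiation to perform the required graph manipulations, so it handles higher-order derivatives better. The paper also presents an application of DiCE to multi-agent learning.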
Abstract
The score function estimator is widely used for estimating gradients of stochastic objectives in stochastic computation graphs (SCG), e.g., in reinforcement learning and meta-learning. While deriving the first-orde…
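As a rough illustration of the score function estimator the abstract refers to, the sketch below estimates a first-order gradient by Monte Carlo sampling. The toy objective (the expectation of x² under a unit-variance Gaussian with mean θ), the function name `score_function_gradient`, and the sample count are assumptions chosen for illustration; none of them come from the paper, and this is not the DiCE objective itself.

```python
import random


def score_function_gradient(theta, num_samples=200_000, seed=0):
    """Monte Carlo estimate of d/dtheta E_{x ~ N(theta, 1)}[x**2].

    Uses the score function identity:
        grad E[f(x)] = E[f(x) * grad log p(x; theta)]
    For a unit-variance Gaussian, grad log p(x; theta) = x - theta.
    """
    rng = random.Random(seed)  # local RNG so the result is reproducible
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(theta, 1.0)   # sample from p(x; theta)
        cost = x * x                # hypothetical toy cost f(x)
        score = x - theta           # d/dtheta log N(x; theta, 1)
        total += cost * score
    return total / num_samples


theta = 1.5
estimate = score_function_gradient(theta)
exact = 2 * theta  # since E[x**2] = theta**2 + 1
print(f"estimate={estimate:.3f}, exact={exact:.3f}")
```

The estimate should land close to the analytic value 2θ, up to sampling noise. Note that the sampled cost is treated as a constant here, which is adequate for a first-order gradient; the summary above says it is higher-order derivatives where the fixed-sample approximation falls short.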