Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning

Jan, 2019

Casey Chu, Jose Blanchet, Peter Glynn

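TL;DR: This paper offers a unifying view of a range of machine learning problems by framing them as the minimization of functionals defined on the space of probability measures. Within this framework, generative adversarial networks, variational inference, and actor-critic methods in reinforcement learning can all be seen as instances of the same problem. The paper introduces the probability functional descent (PFD) algorithm and shows how it recovers existing algorithms that were developed independently in each of these settings.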
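As a rough sketch of what "minimizing a functional on the space of probability measures" can look like, the framing may be written as below. The symbols are assumptions introduced here for illustration and are not taken from the summary above: $\mathcal{P}(X)$ denotes the probability measures on a space $X$, $J$ the functional being minimized, and the three choices of $J$ are illustrative examples, not necessarily the paper's exact definitions.

```latex
% Generic problem: choose a probability measure \mu that minimizes a functional J.
\[
  \min_{\mu \in \mathcal{P}(X)} J(\mu), \qquad J : \mathcal{P}(X) \to \mathbb{R}.
\]

% Illustrative (assumed) choices of J for the three settings:
\begin{align*}
  \text{GANs:} \quad
    & J(\mu) = D(\mu \,\|\, \nu_{\text{data}})
    && \text{a divergence between the generator distribution and the data distribution,} \\
  \text{variational inference:} \quad
    & J(q) = \mathrm{KL}\bigl(q \,\|\, p(\cdot \mid x)\bigr)
    && \text{divergence from the approximate posterior to the true posterior,} \\
  \text{reinforcement learning:} \quad
    & J(\pi) = -\,\mathbb{E}_{\pi}\Bigl[\textstyle\sum_{t} \gamma^{t} r_t\Bigr]
    && \text{negative expected discounted return under policy } \pi.
\end{align*}
```

In each case the optimization variable is a distribution rather than a parameter vector, which is what lets the three settings be read as one problem.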

Abstract

The goal of this paper is to provide a unifying view of a wide range of problems of interest in machine learning by framing them as the minimization of functionals defined on the space of probability measures. In particular, we show that →