TL;DR提出了一种称为 ARAML 的新框架,通过采用最大似然估计算法以及与强化学习中不同的奖励函数,来优化文本 GAN 模型的性能。实验证明,这种模型能比现有模型更稳定、更优秀地生成文本。
Abstract
Most of the existing generative adversarial networks (GAN) for text
generation suffer from the instability of reinforcement learning training
algorithms such as policy gradient, leading to unstable performance. T