Unrolled computation graphs arise in many scenarios, including training RNNs, tuning hyperparameters through unrolled optimization, and training learned optimizers. Current approaches to optimizing parameters in such computation graphs suffer from high-variance gradients, bias, slow updates, or large memory usage. We introduce a method called Persistent Evolution Strategies (PES), which divides the computation graph into a series of truncated unrolls, and performs an evolution strategies-based update step after each unroll. PES eliminates bias from these truncations by accumulating correction terms over the entire sequence of unrolls. PES allows for rapid parameter updates, has low memory usage, is unbiased, and has reasonable variance characteristics. We experimentally demonstrate the advantages of PES compared to several other methods for gradient estimation on synthetic tasks, and show its applicability to training learned optimizers and tuning hyperparameters.
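
To make the "truncated unroll plus accumulated correction" idea concrete, the following is a minimal NumPy sketch of one PES-style gradient estimate. It is an illustration under stated assumptions, not the authors' implementation: the toy inner problem in `unroll`, the Gaussian antithetic perturbations, and all function and variable names (`pes_step`, `sample_antithetic`, `xi`, `sigma`) are hypothetical choices made for this example. The key point is that each particle keeps its own state and a running sum `xi` of every perturbation applied to it since the start of the sequence, and the estimate weights the current truncated loss by that running sum rather than by the latest perturbation alone.

```python
import numpy as np


def unroll(state, theta, num_steps):
    """Hypothetical toy inner problem: s <- 0.9 * s + theta, loss = sum of ||s||^2."""
    loss = 0.0
    for _ in range(num_steps):
        state = 0.9 * state + theta
        loss += float(np.sum(state ** 2))
    return state, loss


def sample_antithetic(rng, num_particles, dim, sigma):
    """Draw Gaussian perturbations in +/- pairs (num_particles assumed even)."""
    half = rng.normal(0.0, sigma, size=(num_particles // 2, dim))
    return np.concatenate([half, -half], axis=0)


def pes_step(theta, states, xi, eps, num_steps, sigma):
    """One truncated unroll per particle, followed by a PES-style estimate.

    theta:  (dim,) parameters being optimized
    states: (num_particles, dim) per-particle state carried across unrolls
    xi:     (num_particles, dim) perturbations accumulated over earlier unrolls
    eps:    (num_particles, dim) fresh perturbations for this unroll
    """
    num_particles = states.shape[0]
    new_states = np.empty_like(states)
    losses = np.empty(num_particles)
    for i in range(num_particles):
        # Each particle continues from its own state with perturbed parameters.
        new_states[i], losses[i] = unroll(states[i], theta + eps[i], num_steps)
    # Accumulate perturbations over the whole sequence of unrolls so far.
    xi = xi + eps
    # Weight each particle's truncated loss by its accumulated perturbation.
    grad = (xi * losses[:, None]).sum(axis=0) / (num_particles * sigma ** 2)
    return grad, new_states, xi
```

In a training loop under these assumptions, one would call `sample_antithetic` and then `pes_step` once per truncated unroll, apply `grad` to `theta` with any optimizer, and reset both `states` and `xi` to zero whenever the full sequence ends and a new one begins.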

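Summary: The paper introduces Persistent Evolution Strategies (PES), a method for updating parameters in unrolled computation graphs that addresses the high-variance gradients, bias, slow updates, and large memory usage of existing approaches. PES divides the computation graph into a series of truncated unrolls and performs an evolution strategies update step after each one, removing the truncation bias by accumulating correction terms over the entire sequence of unrolls. Experiments on synthetic tasks show that PES has advantages over several other gradient estimation methods, and that it also applies to training learned optimizers and tuning hyperparameters.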

Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies