BriefGPT.xyz
Jan, 2021
Differentiable Trust Region Layers for Deep Reinforcement Learning
Fabian Otto, Philipp Becker, Ngo Anh Vien, Hanna Carolin Ziesche, Gerhard Neumann
TL;DR
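This paper proposes differentiable neural network layers that enforce trust regions for deep Gaussian policies via closed-form projections. It derives trust region projections for Gaussian distributions based on the KL divergence, the Wasserstein L2 distance, and the Frobenius norm. Experiments show that these projection layers achieve similar or better results and are almost insensitive to specific implementation choices.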
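To make the idea of a closed-form projection concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. It assumes a simplified trust region on the mean only, defined by the squared Euclidean distance to the old policy's mean, and it adds the standard closed-form KL divergence between diagonal Gaussians for reference. The names `project_mean`, `kl_diag_gaussians`, and the bound `eps` are hypothetical.

```python
import math


def project_mean(mean, old_mean, eps):
    """Project `mean` onto the ball {m : ||m - old_mean||^2 <= eps}.

    Assumption: a Euclidean trust region on the mean only; the paper's
    layers also handle the covariance and other distance measures.
    """
    diff = [m - o for m, o in zip(mean, old_mean)]
    sq_dist = sum(d * d for d in diff)
    if sq_dist <= eps:
        # Already inside the trust region: leave the mean unchanged.
        return list(mean)
    # Outside: shrink the step toward the old mean so that the squared
    # distance equals eps (closed form, no iterative solver needed).
    scale = math.sqrt(eps / sq_dist)
    return [o + scale * d for o, d in zip(old_mean, diff)]


def kl_diag_gaussians(mean_p, var_p, mean_q, var_q):
    """Closed-form KL(p || q) for Gaussians with diagonal covariances."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


if __name__ == "__main__":
    old_mean = [0.0, 0.0]
    new_mean = [3.0, 4.0]  # squared distance 25, outside eps = 1
    projected = project_mean(new_mean, old_mean, eps=1.0)
    print(projected)
    print(kl_diag_gaussians(projected, [1.0, 1.0], old_mean, [1.0, 1.0]))
```

Because the projection is a simple rescaling, it is differentiable almost everywhere with respect to `mean`, which is what lets such a step sit inside a network as a layer in an automatic differentiation framework.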
Abstract
Trust region methods are a popular tool in reinforcement learning as they yield robust policy updates in continuous and discrete action spaces. However, enforcing such trust regions in deep …