Implicit Quantile Networks for Distributional Reinforcement Learning

Jun, 2018

Will Dabney, Georg Ostrovski, David Silver, Rémi Munos

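TL;DR: This paper introduces a distributional reinforcement learning method that uses quantile regression to approximate the full quantile function of the state-action return distribution, yielding a flexible, efficient, and broadly applicable distributional variant of DQN. It evaluates the algorithm on the 57 Atari 2600 games and uses the implicitly defined return distributions to study the effects of risk-sensitive policies in Atari games.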
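To make the quantile-regression idea concrete, here is a minimal NumPy sketch of a quantile Huber loss of the kind used in quantile-based distributional RL. It is an illustration under stated assumptions, not the authors' implementation: the function name `quantile_huber_loss`, the argument layout, and the default threshold `kappa=1.0` are choices made for this example.

```python
import numpy as np

def quantile_huber_loss(pred, target, tau, kappa=1.0):
    """Quantile Huber loss between quantile estimates and target samples.

    pred:   shape (N,), estimated return quantiles at fractions tau
    target: shape (M,), sampled target returns
    tau:    shape (N,), quantile fractions in (0, 1)
    kappa:  Huber threshold (assumed default for this sketch)
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    tau = np.asarray(tau, dtype=float)

    # Pairwise errors delta[i, j] = target[j] - pred[i], shape (N, M).
    delta = target[None, :] - pred[:, None]
    abs_delta = np.abs(delta)

    # Huber loss: quadratic near zero, linear beyond kappa.
    huber = np.where(abs_delta <= kappa,
                     0.5 * delta ** 2,
                     kappa * (abs_delta - 0.5 * kappa))

    # Asymmetric weight |tau - 1{delta < 0}|: underestimates are weighted
    # by tau, overestimates by (1 - tau).
    weight = np.abs(tau[:, None] - (delta < 0).astype(float))

    # Sum over predicted quantiles, average over target samples.
    return float((weight * huber / kappa).sum(axis=0).mean())
```

The asymmetric weight is what pushes each estimate toward its own quantile: for a fraction of 0.9, an estimate below the target is penalized nine times as heavily as an estimate the same distance above it. In a setup like the one the summary describes, the fractions `tau` would be drawn from a uniform distribution on (0, 1) rather than fixed in advance.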

Abstract

In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN. We achieve this by using quantile regressi