Gray Box: Understanding DQNs

Feb, 2016

Graying the black box: Understanding DQNs

Tom Zahavy, Nir Ben Zrihem, Shie Mannor

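TL;DR: This paper presents a methodology and tools for analyzing Deep Q-networks (DQNs), along with an algorithm that automatically learns a Semi Aggregated Markov Decision Process (SAMDP) model. The SAMDP model lets us identify spatio-temporal abstractions directly from features, and it could serve as a sub-goal detector in future work. Using our tools, we show that the features learned by DQNs aggregate the state space in a hierarchical fashion, explaining its success. We are also able to understand and describe the policies that DQNs learn for three different Atari 2600 games, and we suggest ways to interpret, debug, and optimize deep neural networks in reinforcement learning.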
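To make the idea of identifying abstractions "directly from features" more concrete, here is a minimal sketch. It is not the authors' algorithm, which this summary does not spell out. It assumes that a trajectory of feature vectors (for example, last-hidden-layer activations recorded while an agent plays) is grouped with plain k-means as a stand-in clustering step, and that transitions between clusters are then counted to get an aggregated transition matrix. The function names `kmeans` and `cluster_transitions`, and the toy data, are hypothetical.

```python
import numpy as np


def kmeans(features, k, iters=50, seed=0):
    """Plain Lloyd's k-means (a stand-in clustering step, assumed here).

    Returns one integer cluster label per row of `features`.
    """
    rng = np.random.default_rng(seed)
    # Start from k distinct data points as the initial centers.
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every point to every center.
        dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return labels


def cluster_transitions(labels, k):
    """Estimate transition probabilities between clusters.

    Only steps that leave a cluster are counted, so each row describes
    where the trajectory goes once it exits that cluster.
    """
    counts = np.zeros((k, k))
    for a, b in zip(labels[:-1], labels[1:]):
        if a != b:
            counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no outgoing transitions stay all-zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts),
                     where=row_sums > 0)


if __name__ == "__main__":
    # Toy "activations": the trajectory sits in one region, moves to
    # another, then returns. Real inputs would be DQN features.
    toy = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                    [10.0, 10.0], [10.1, 10.0], [10.0, 10.1],
                    [0.1, 0.1], [0.0, 0.2]])
    toy_labels = kmeans(toy, k=2)
    print(toy_labels)
    print(cluster_transitions(toy_labels, k=2))
```

In this simplified picture each cluster is an aggregated state and each nonzero entry of the matrix is a candidate transition between aggregated states. The number of clusters `k` is a free parameter here and would need to be chosen for a real network.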

Abstract

In recent years there is a growing interest in using deep representations for reinforcement learning. In this paper, we present a methodology and tools to analyze Deep Q-networks (DQNs) in a non-blind manner. Usi