Explaining Agent Behaviour Through Intended Outcomes: What Did You Think Would Happen?

November 2020

What Did You Think Would Happen? Explaining Agent Behaviour Through Intended Outcomes

Herman Yau, Chris Russell, Simon Hadfield

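TL;DR: The paper proposes a new form of explanation for reinforcement learning based on the notion of intended outcomes, introduces methods for extracting local explanations for several variants of Q-function approximation, and demonstrates them across multiple environments and algorithms.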

Abstract

We present a novel form of explanation for reinforcement learning, based around the notion of intended outcome. These explanations describe the outcome an agent is trying to achieve by its actions. We provide a s