Explaining Reinforcement Learning Policies through Counterfactual Trajectories

January 2022

Explaining Reinforcement Learning Policies through Counterfactual Trajectories

Julius Frost, Olivia Watkins, Eric Weiner, Pieter Abbeel, Trevor Darrell...

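TL;DR: By showing how a reinforcement learning agent behaves across a broader distribution of trajectories, our method conveys how the agent performs under distribution shift, which helps users validate the agent effectively. In a user study, users scored higher on an agent validation task with our method than with the baseline methods.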

Abstract

In order for humans to confidently decide where to employ RL agents for real-world tasks, a human developer must validate that the agent will perform well at test-time. Some policy interpretability methods facilitate this by capturing the policy's decision making in a set of agent roll