TL;DR使用 Differentiable Decision Trees 学习可解释的奖励函数,研究表明其能够学习可解释的奖励函数,但树的离散性会降低强化学习的性能。
Abstract
There is an increasing interest in learning reward functions that model human
intent and human preferences. However, many frameworks use blackbox learning
methods that, while expressive, are difficult to interpre