BriefGPT.xyz
Jan, 2023
马尔可夫决策过程的最优决策树策略
Optimal Decision Tree Policies for Markov Decision Processes
HTML
PDF
Daniël Vos, Sicco Verwer
TL;DR
本文提出了一种优化方法,通过线性规划直接优化有限深度的决策树,使其在马尔科夫决策过程中达到最优性能,可用于解决强化学习策略可解释性的问题。通过实验证明,这种方法在性能和可解释性之间取得了平衡。
Abstract
interpretability
of
reinforcement learning
policies is essential for many real-world tasks but learning such interpretable policies is a hard problem. Particularly rule-based policies such as
→