The Inverse Reinforcement Learning (\textit{IRL}) problem has seen rapid
evolution in the past few years, with important applications in domains like
robotics, cognition, and health. In this work, we explore the inefficacy of
current IRL methods in learning an agent's reward function from expert
trajectories depicting long-horizon, complex sequential tasks. We hypothesize
that imbuing IRL models with structural motifs capturing underlying tasks can
enable and enhance their performance. Accordingly, we propose a novel IRL
method, SMIRL, that first learns the (approximate) structure of a task as a
finite-state automaton (FSA) and then uses this structural motif to solve the
IRL problem. We test our model on both discrete grid-world and high-dimensional
continuous-domain environments. We empirically show that our proposed approach
successfully learns all four complex tasks, where two foundational IRL
baselines fail. Our model also outperforms the baselines in sample efficiency
on a simpler toy task. We further show promising test results in a modified
continuous domain on tasks with compositional reward functions.

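This paper examines the ineffectiveness of current IRL methods at learning an
agent's reward function in long-horizon, complex sequential tasks, and proposes
a new IRL method, SMIRL, which represents the structure of a task as a
finite-state automaton and then uses that structural motif to solve the IRL
problem. Through experiments in discrete and high-dimensional continuous
environments, we demonstrate the method's effectiveness and efficiency, and
show that it continues to perform well on tasks with compositional reward
functions.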