Inverse reinforcement learning (IRL) aims to infer a reward from expert
demonstrations, motivated by the idea that the reward, rather than the policy,
is the most succinct and transferable description of a task [Ng et al., 2000].
However, the reward corresponding to an optimal policy is not unique, making it
unclear whether an IRL-learned reward is transferable to new transition laws in the
sense that its optimal policy aligns with the optimal policy corresponding to
the expert's true reward. Past work has addressed this problem only under the
assumption of full access to the expert's policy, guaranteeing transferability
when learning from two experts with the same reward but different transition
laws that satisfy a specific rank condition [Rolland et al., 2022]. In this
work, we show that the conditions developed under full access to the expert's
policy cannot guarantee transferability in the more practical scenario where we
have access only to demonstrations of the expert. Instead of a binary rank
condition, we propose principal angles as a more refined measure of similarity
and dissimilarity between transition laws. Based on this measure, we establish two
key results: 1) a sufficient condition for transferability to any transition
law when learning from at least two experts with sufficiently different
transition laws, and 2) a sufficient condition for transferability to local
changes in the transition law when learning from a single expert. Furthermore,
we provide a probably approximately correct (PAC) algorithm and an
end-to-end analysis for learning transferable rewards from demonstrations of
multiple experts.

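Inverse reinforcement learning aims to infer a reward from expert
demonstrations, but the reward corresponding to an optimal policy is not
unique. This paper proposes principal angles as a more refined measure of
similarity and dissimilarity between transition laws and establishes two key
results: 1) a sufficient condition for transferability to any transition law
when learning from at least two experts with sufficiently different transition
laws, and 2) a sufficient condition for transferability to local changes in
the transition law when learning from a single expert. It also provides a
probably approximately correct (PAC) algorithm and an end-to-end analysis for
learning transferable rewards from demonstrations of multiple experts.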
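
To make the notion of principal angles concrete, the following minimal sketch
computes the principal angles between the column spaces of two matrices with
the standard QR-plus-SVD recipe. It illustrates the general linear-algebra
concept only. The abstract does not say which subspace is associated with each
transition law, so the matrices `A` and `B` and the helper name
`principal_angles` are hypothetical placeholders and not the paper's
construction.

```python
import numpy as np


def principal_angles(A, B):
    """Principal angles (radians, ascending) between span(A) and span(B).

    Assumption: A and B have full column rank, so reduced QR yields
    orthonormal bases of their column spaces.
    """
    Qa, _ = np.linalg.qr(A)  # orthonormal basis of span(A)
    Qb, _ = np.linalg.qr(B)  # orthonormal basis of span(B)
    # The singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Clip guards against round-off pushing a cosine slightly above 1.
    return np.arccos(np.clip(cosines, -1.0, 1.0))


# Illustrative placeholder subspaces in R^3 (not derived from any transition law):
# both planes contain e1, and the second plane is tilted by t around e1.
t = np.pi / 6
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[1.0, 0.0],
              [0.0, np.cos(t)],
              [0.0, np.sin(t)]])
angles = principal_angles(A, B)
```

Unlike a binary rank test, the angles vary continuously: here the two planes
share one direction (angle near zero) and differ by the tilt `t` in the other,
so the result distinguishes "barely different" from "very different" subspaces.
The arccosine is numerically sensitive for angles close to zero, so a
sine-based formulation may be preferable when small angles matter.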