Game-theoretic inverse learning is the problem of inferring the players' objectives from their actions. We formulate an inverse learning problem in a Stackelberg game between a leader and a follower, where each player's action is the trajectory of a dynamical system. We propose an active inverse learning method for the leader to infer which hypothesis among a finite set of candidates describes the follower's objective function. Instead of using passively observed trajectories like existing methods, the proposed method actively maximizes the differences in the follower's trajectories under different hypotheses to accelerate the leader's inference. We demonstrate the proposed method in a receding-horizon repeated trajectory game. Compared with uniformly random inputs, the leader inputs provided by the proposed method accelerate the convergence of the probability of different hypotheses conditioned on the follower's trajectory by orders of magnitude.
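To make the idea concrete, the following is a minimal sketch of active hypothesis discrimination followed by a Bayesian belief update. It is a toy illustration, not the method of the paper. It assumes a hypothetical scalar model in which the follower's best response under hypothesis i is `GAINS[i] * u`, observed with Gaussian noise of standard deviation `SIGMA`. All names and numerical values here (`GAINS`, `SIGMA`, `separation`, `choose_active_input`, `bayes_update`, `run`) are assumptions introduced for illustration.

```python
import numpy as np

# Hypothetical toy model (an assumption, not taken from the paper):
# under hypothesis i, the follower's best response to a scalar leader
# input u is GAINS[i] * u, observed with Gaussian noise of std SIGMA.
GAINS = np.array([0.5, 1.0, 1.5])
SIGMA = 0.5


def predicted_responses(u):
    """Follower response predicted by each hypothesis for leader input u."""
    return GAINS * u


def separation(u):
    """Sum of squared differences between predictions over hypothesis pairs."""
    y = predicted_responses(u)
    # The full difference matrix counts every unordered pair twice.
    return float(np.sum((y[:, None] - y[None, :]) ** 2)) / 2.0


def choose_active_input(candidates):
    """Pick the candidate input that best separates the hypotheses."""
    scores = [separation(u) for u in candidates]
    return float(candidates[int(np.argmax(scores))])


def bayes_update(prior, u, y_obs):
    """Posterior over hypotheses after observing follower response y_obs."""
    log_lik = -0.5 * ((y_obs - predicted_responses(u)) / SIGMA) ** 2
    log_post = np.log(prior) + log_lik
    log_post -= log_post.max()  # subtract the max for numerical stability
    post = np.exp(log_post)
    return post / post.sum()


def run(policy, true_idx=1, steps=20, seed=0):
    """Simulate repeated interaction; policy is 'active' or 'random'."""
    rng = np.random.default_rng(seed)
    candidates = np.linspace(-1.0, 1.0, 21)
    belief = np.full(len(GAINS), 1.0 / len(GAINS))
    for _ in range(steps):
        if policy == "active":
            u = choose_active_input(candidates)
        else:
            u = float(rng.uniform(-1.0, 1.0))
        y_obs = GAINS[true_idx] * u + SIGMA * rng.normal()
        belief = bayes_update(belief, u, y_obs)
    return belief
```

Calling `run("active")` and `run("random")` with the same seed gives two posterior vectors that can be compared directly. In this toy model the separating input always lies at the boundary of the candidate set, because the predicted responses scale linearly with the input. The trajectory game in the paper replaces this scalar response with the follower's optimal trajectory.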

Active Inverse Learning in Stackelberg Trajectory Games