Learning from Demonstration (LfD) is a powerful method for enabling robots to perform novel tasks as it is often more tractable for a non-roboticist end-user to demonstrate the desired skill and for the robot to efficiently learn from the associated data than for a human to engineer a reward function for the robot to learn the skill via reinforcement learning (RL). Safety issues arise in modern LfD techniques, e.g., Inverse Reinforcement Learning (IRL), just as they do for RL; yet, safe learning in LfD has received little attention. In the context of agile robots, safety is especially vital due to the possibility of robot-environment collision, robot-human collision, and damage to the robot. In this paper, we propose a safe IRL framework, CBFIRL, that leverages the Control Barrier Function (CBF) to enhance the safety of the IRL policy. The core idea of CBFIRL is to combine a loss function inspired by CBF requirements with the objective in an IRL method, both of which are jointly optimized via gradient descent. In the experiments, we show our framework performs safer compared to IRL methods without CBF, that is $\sim15\%$ and $\sim20\%$ improvement for two levels of difficulty of a 2D racecar domain and $\sim 50\%$ improvement for a 3D drone domain.

本文提出了一个名为CBFIRL的安全IRL框架，利用控制屏障函数（CBF）增强IRL策略的安全性，通过梯度下降联合优化CBF要求灵感的损失函数和IRL方法的目标，在实验中表明我们的框架相较于没有CBF的IRL方法更加安全，并展示了在2D赛车领域的两个难度级别和3D无人机领域的安全性提高了15%和20%和50%。

基于控制壁函数的安全反向强化学习