Head-to-head autonomous racing is a challenging problem: the vehicle must operate at its friction or handling limits to achieve minimum lap times while actively seeking strategies to overtake or stay ahead of the opponent. In this work, we propose a head-to-head racing environment for reinforcement learning that accurately models vehicle dynamics. Some previous works have tried to learn a policy directly in an environment with complex vehicle dynamics but have failed to learn an optimal policy. We therefore propose a curriculum learning-based framework that transitions from a simpler vehicle model to a more complex real environment, teaching the reinforcement learning agent a policy closer to the optimal one. We also propose a control barrier function-based safe reinforcement learning algorithm that enforces the agent's safety more effectively without compromising optimality.
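To make the control barrier function idea concrete, the following is a minimal sketch of a safety filter on a deliberately simplified one-dimensional problem. It is not the algorithm proposed in this work. It assumes the state is the gap to the opponent, the control is the rate of change of that gap, and the safe set is `h = gap - d_min >= 0`. The names `cbf_filter`, `d_min`, and `alpha` are hypothetical and chosen only for illustration. The filter keeps the learned action whenever it satisfies the barrier condition `dh/dt + alpha * h >= 0`, and otherwise replaces it with the closest action that does.

```python
def cbf_filter(u_rl, gap, d_min=1.0, alpha=2.0):
    """Return the action closest to u_rl that satisfies the CBF condition.

    Assumed toy model (not from the paper): d(gap)/dt = u.
    Barrier: h = gap - d_min, safe when h >= 0.
    CBF condition: u + alpha * h >= 0, i.e. u >= -alpha * h.
    """
    h = gap - d_min
    u_lower = -alpha * h          # smallest gap rate the barrier allows
    return max(u_rl, u_lower)     # closed-form solution of the 1-D projection


def simulate(gap0, u_rl, steps=1000, dt=0.01):
    """Roll out the toy model with a constant, aggressive learned action."""
    gap = gap0
    min_gap = gap
    for _ in range(steps):
        u = cbf_filter(u_rl, gap)
        gap += dt * u             # forward Euler step of d(gap)/dt = u
        min_gap = min(min_gap, gap)
    return min_gap


if __name__ == "__main__":
    # The learned action always tries to close the gap at 5 units/s.
    print("minimum gap reached:", simulate(gap0=3.0, u_rl=-5.0))
```

In this toy setting the filter only intervenes near the boundary of the safe set, which is the sense in which a barrier-based approach can enforce safety while leaving the learned policy unchanged elsewhere.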

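In studying optimal policies for head-to-head autonomous racing, we propose a curriculum learning-based framework that gradually transitions to a more complex real environment in order to teach the reinforcement learning agent a policy closer to the optimal one. We also propose a control barrier function-based safe reinforcement learning algorithm that effectively ensures the agent's safety without sacrificing the optimality of the policy.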

Towards Optimal Head-to-Head Autonomous Racing with Curriculum Reinforcement Learning