In Formula One, teams compete to develop their cars and achieve the highest possible finishing position in each race. During a race, however, teams are unable to alter the car, so they must improve their cars' finishing positions via race strategy, i.e. optimising their selection of which tyre compounds to put on the car and when to do so. In this work, we introduce a reinforcement learning model, RSRL (Race Strategy Reinforcement Learning), to control race strategies in simulations, offering a faster alternative to the industry standard of hard-coded and Monte Carlo-based race strategies. Controlling cars with a pace equating to an expected finishing position of P5.5 (where P1 represents first place and P20 is last place), RSRL achieves an average finishing position of P5.33 on our test race, the 2023 Bahrain Grand Prix, outperforming the best baseline of P5.63. We then demonstrate, in a generalisability study, how performance for one track or multiple tracks can be prioritised via training. Further, we supplement model predictions with feature importance, decision tree-based surrogate models, and decision tree counterfactuals towards improving user trust in the model. Finally, we provide illustrations which exemplify our approach in real-world situations, drawing parallels between simulations and reality.

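This paper addresses the optimisation of race strategy in Formula One by introducing a new reinforcement learning model, RSRL (Race Strategy Reinforcement Learning), which offers a faster alternative to traditional hard-coded and Monte Carlo-based strategies. In simulations of the 2023 Bahrain Grand Prix, RSRL achieves an average finishing position of P5.33, outperforming the best baseline of P5.63, and it uses feature importance and decision tree models to improve the model's explainability and user trust.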

Explainable Reinforcement Learning for Formula One Race Strategy