Traffic simulators are used to generate data for learning in intelligent transportation systems (ITSs). A key question is to what extent their modelling assumptions affect the capabilities of ITSs to adapt to various scenarios when deployed in the real world. This work focuses on two simulators commonly used to train reinforcement learning (RL) agents for traffic applications, CityFlow and SUMO. A controlled virtual experiment varying driver behavior and simulation scale finds evidence against distributional equivalence in RL-relevant measures from these simulators, with the root mean squared error and KL divergence being significantly greater than 0 for all assessed measures. While granular real-world validation generally remains infeasible, these findings suggest that traffic simulators are not a deus ex machina for RL training: understanding the impacts of inter-simulator differences is necessary to train and deploy RL-based ITSs.
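As a rough illustration of the two discrepancy measures named above, the sketch below computes a root mean squared error between paired outputs and a histogram-based estimate of the KL divergence between two samples. It is a minimal sketch under stated assumptions, not the procedure used in this work: the function names, the binning scheme, the smoothing constant `eps`, and the example travel times are all hypothetical.

```python
import math


def rmse(a, b):
    """Root mean squared error between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def histogram_probs(samples, lo, hi, bins, eps):
    """Bin samples on [lo, hi] and return smoothed bin probabilities."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in samples:
        # Clamp so that x == hi falls in the last bin.
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    # eps keeps empty bins from producing log(0) or division by zero
    # (a smoothing assumption made for this sketch).
    total = len(samples) + eps * bins
    return [(c + eps) / total for c in counts]


def kl_divergence(p_samples, q_samples, bins=20, eps=1e-9):
    """Histogram estimate of KL(P || Q) from two samples on shared bins."""
    lo = min(min(p_samples), min(q_samples))
    hi = max(max(p_samples), max(q_samples))
    if hi == lo:
        return 0.0  # both samples are the same constant
    p = histogram_probs(p_samples, lo, hi, bins, eps)
    q = histogram_probs(q_samples, lo, hi, bins, eps)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    # Made-up travel times (seconds) standing in for matched runs
    # of two simulators; these are not results from the study.
    sim_a = [41.0, 39.5, 44.2, 40.1, 42.7, 38.9]
    sim_b = [45.3, 43.8, 47.0, 44.6, 46.1, 42.2]
    print("RMSE:", rmse(sim_a, sim_b))
    print("KL(A || B):", kl_divergence(sim_a, sim_b, bins=5))
```

Both quantities are zero when the two samples coincide, so values clearly above zero on matched scenarios would point away from distributional equivalence. The histogram estimate is sensitive to the choice of `bins` and `eps`, which is one reason to treat it as illustrative only.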

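Traffic simulators play an important role in generating learning data for intelligent transportation systems. Through a controlled experiment on CityFlow and SUMO, two simulators commonly used to train reinforcement learning (RL) agents for traffic applications, this study finds that the two are not distributionally equivalent on RL-relevant measures, which suggests that traffic simulators are not a cure-all for RL training. Understanding the differences between simulators is essential for training and deploying RL-based intelligent transportation systems.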

Purpose in the Machine: Do Traffic Simulators Produce Distributionally Equivalent Outcomes for Reinforcement Learning Applications?