The dynamic vehicle dispatching problem corresponds to deciding which vehicles to assign to requests that arise stochastically over time and space. It emerges in diverse areas, such as in the assignment of trucks to loads to be transported; in emergency systems; and in ride-hailing services. In this paper, we model the problem as a semi-Markov decision process, which allows us to treat time as continuous. In this setting, decision epochs coincide with discrete events whose time intervals are random. We argue that an event-based approach substantially reduces the combinatorial complexity of the decision space and overcomes other limitations of discrete-time models often proposed in the literature. In order to test our approach, we develop a new discrete-event simulator and use double deep q-learning to train our decision agents. Numerical experiments are carried out in realistic scenarios using data from New York City. We compare the policies obtained through our approach with heuristic policies often used in practice. Results show that our policies exhibit better average waiting times, cancellation rates and total service times, with reduction in average waiting times of up to 50% relative to the other tested heuristic policies.

本文提出了一种基于半马尔可夫决策过程和离散事件模拟的动态车辆调度问题解决方法，并在纽约市的真实场景下进行了数值实验，结果显示相对于其他启发式策略，该方法的平均等待时间减少了50%。

基于事件的深度强化学习应用于动态车辆调度问题