Xiaocheng Tang, Fan Zhang, Zhiwei Qin, Yansheng Wang...
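TL;DR: This paper proposes V1D3, a value-based dynamic learning framework that handles vehicle dispatching and repositioning together. It periodically combines online experience with historical trajectory data, achieves substantial improvements, and won the vehicle dispatching and repositioning tasks of the KDD Cup 2020 RL competition.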
Abstract
Large ride-hailing platforms, such as DiDi, Uber and Lyft, connect tens of thousands of vehicles in a city to millions of ride demands throughout the day, offering great promise for improving transportation efficiency through the tasks of order dispatching and →