Chenguang Wang, Yaodong Yang, Oliver Slumbers, Congying Han, Tiande Guo...
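TL;DR: The paper uses a two-player zero-sum game built on the PSRO (Policy Space Response Oracle) method to improve the generalization ability of deep learning-based solvers, aiming for the most generalizable performance across different TSP tasks; the resulting solver population attains lower exploitability and reaches a Nash equilibrium.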
Abstract
In this paper, we shed new light on the generalization ability of deep learning-based solvers for Traveling Salesman Problems (TSP). Specifically, we introduce a two-player zero-sum framework between a trainable