Recently, deep reinforcement learning (DRL) models have shown promising results in solving NP-hard Combinatorial Optimization (CO) problems. However, most DRL solvers can only scale to a few hundred nodes for combinatorial optimization problems on graphs, such as the Traveling Salesman Problem (TSP). This paper addresses the scalability challenge in large-scale combinatorial optimization by proposing a novel approach, namely DIMES. Unlike previous DRL methods, which suffer from costly autoregressive decoding or iterative refinements of discrete solutions, DIMES introduces a compact continuous space for parameterizing the underlying distribution of candidate solutions. Such a continuous space allows stable REINFORCE-based training and fine-tuning via massively parallel sampling. We further propose a meta-learning framework to enable the effective initialization of model parameters in the fine-tuning stage. Extensive experiments show that DIMES outperforms recent DRL-based methods on large benchmark datasets for Traveling Salesman Problems and Maximal Independent Set problems.

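This paper proposes DIMES, an algorithm that optimizes the distribution of candidate solutions through sampling. It replaces the discrete solution space with a continuous one, trains and fine-tunes with stable REINFORCE-based updates, and uses a meta-learning framework to initialize the model parameters effectively, addressing the scalability challenge that DRL faces in large-scale combinatorial optimization. Experiments show that DIMES outperforms recent DRL methods on large benchmark datasets for the Traveling Salesman Problem and the Maximal Independent Set problem.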
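
To make the idea of a continuous parameterization trained with REINFORCE more concrete, here is a minimal sketch on a toy TSP instance. It is not the authors' implementation: it optimizes a matrix of edge scores (called `theta` here, a hypothetical name) directly for a single instance, and it leaves out the neural network and the meta-learning stage entirely. The sampling rule (softmax over unvisited nodes), the mean-cost baseline, and the function names `sample_tour`, `tour_length`, and `reinforce` are all assumptions chosen for illustration.

```python
import numpy as np


def sample_tour(theta, rng):
    """Sample one tour starting at node 0 from edge scores theta.

    Returns the tour and the gradient of its log-probability w.r.t. theta.
    Assumption: the next node is drawn from a softmax over unvisited nodes.
    """
    n = theta.shape[0]
    tour = [0]
    visited = np.zeros(n, dtype=bool)
    visited[0] = True
    grad_logp = np.zeros_like(theta)
    for _ in range(n - 1):
        cur = tour[-1]
        # Mask visited nodes, then apply a numerically stable softmax.
        logits = np.where(visited, -np.inf, theta[cur])
        logits = logits - logits[~visited].max()
        probs = np.exp(logits)
        probs /= probs.sum()
        nxt = int(rng.choice(n, p=probs))
        # d log softmax / d theta[cur] = one_hot(nxt) - probs
        grad_logp[cur] -= probs
        grad_logp[cur, nxt] += 1.0
        visited[nxt] = True
        tour.append(nxt)
    return tour, grad_logp


def tour_length(coords, tour):
    """Euclidean length of the closed tour."""
    pts = coords[tour]
    return float(np.linalg.norm(pts - np.roll(pts, -1, axis=0), axis=1).sum())


def reinforce(coords, steps=200, batch=64, lr=0.5, seed=0):
    """Minimize expected tour length over theta with REINFORCE."""
    rng = np.random.default_rng(seed)
    n = len(coords)
    theta = np.zeros((n, n))  # continuous parameters, one score per edge
    history = []
    for _ in range(steps):
        samples = [sample_tour(theta, rng) for _ in range(batch)]
        costs = np.array([tour_length(coords, t) for t, _ in samples])
        adv = costs - costs.mean()  # mean-cost baseline reduces variance
        grad = sum(a * g for a, (_, g) in zip(adv, samples)) / batch
        theta -= lr * grad  # gradient descent on the expected cost
        history.append(float(costs.mean()))
    return theta, history


if __name__ == "__main__":
    coords = np.random.default_rng(1).random((8, 2))
    theta, history = reinforce(coords)
    print("mean sampled tour length, first vs last step:", history[0], history[-1])
```

Each sample in a batch is independent of the others, which is the property that makes this kind of estimator easy to parallelize. The loop above is sequential only to keep the sketch short.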

DIMES: A Differentiable Meta Solver for Combinatorial Optimization Problems