BriefGPT.xyz
Mar, 2024
Self-Improvement for Neural Combinatorial Optimization: Sample without Replacement, but Improvement
Jonathan Pirnay, Dominik G. Grimm
TL;DR
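By combining behavior cloning with reinforcement learning, this paper simplifies the training of end-to-end neural combinatorial optimization models. It randomly samples solutions and applies a probabilistic policy improvement step to raise model performance. The approach achieves satisfactory results on the Traveling Salesman Problem and the Vehicle Routing Problem, and when applied to the Job Shop Scheduling Problem it outperforms existing methods.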
Abstract
Current methods for end-to-end constructive neural combinatorial optimization usually train a policy using behavior cloning from expert solutions or policy gradient methods from reinforcement learning. While beha