BriefGPT.xyz
Oct, 2022
Exploring Mode Connectivity for Pre-trained Language Models
Yujia Qin, Cheng Qian, Jing Yi, Weize Chen, Yankai Lin...
TL;DR
This paper studies the geometric connections between the minima reached by pre-trained language models under different configurations, and how task knowledge changes along the paths connecting them. By exploring the mode connectivity of PLMs, it aims to understand the geometric relations among different minima and thereby shed light on the inner mechanism of PLM downstream adaptation.
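Mode connectivity asks whether two minima in parameter space can be joined by a path of low loss. A minimal sketch of the simplest variant, linear interpolation between two parameter vectors with a toy loss (all names and the loss function here are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def interpolate_params(theta_a, theta_b, alpha):
    """Point on the straight line between two minima in parameter space."""
    return (1 - alpha) * theta_a + alpha * theta_b

def loss(theta):
    # Toy loss with two equivalent minima at +1 and -1 (illustrative only).
    return min(np.sum((theta - 1.0) ** 2), np.sum((theta + 1.0) ** 2))

theta_a = np.full(4, 1.0)   # one minimum
theta_b = np.full(4, -1.0)  # another minimum
path_losses = [loss(interpolate_params(theta_a, theta_b, a))
               for a in np.linspace(0.0, 1.0, 11)]

# A loss barrier along the path means the two minima are not linearly
# connected; mode-connectivity analyses look for (possibly curved) paths
# where this barrier vanishes.
barrier = max(path_losses) - max(path_losses[0], path_losses[-1])
```

In practice `theta_a` and `theta_b` would be flattened checkpoints of two fine-tuned PLMs and `loss` an evaluation on held-out task data; the toy example above merely shows the interpolation mechanics.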
Abstract
Recent years have witnessed the prevalent application of pre-trained language models (PLMs) in NLP. From the perspective of parameter space, PLMs provide generic initialization, starting from which high-performan…