BriefGPT.xyz
Dec, 2023
模型面包屑: 利用稀疏掩码扩展多任务模型合并
Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks
HTML
PDF
MohammadReza Davari, Eugene Belilovsky
TL;DR
这项研究提出了一种名为Model Breadcrumbs的新方法,通过在预训练模型的权重空间内的轨迹上雕刻一组稀疏定义的权重,从而增强任务性能,并在多个任务中同时改善性能,为构建多任务模型和更新基础模型提供了一种简单、高效和极其有效的方法。
Abstract
The rapid development of
ai systems
has been greatly influenced by the emergence of
foundation models
. A common approach for targeted problems involves
→