BriefGPT.xyz
May, 2023
Distill Gold from Massive Ores: Efficient Dataset Distillation via Critical Samples Selection
Yue Xu, Yong-Lu Li, Kaitong Cui, Ziyu Wang, Cewu Lu...
TL;DR
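This paper proposes a new dataset distillation method based on information theory and sample utility. Following a comprehensive analysis of data selection, the method greatly reduces training cost, extends existing distillation algorithms to larger and more diverse datasets, and consistently improves performance across many different types of datasets.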
Abstract
Data-efficient learning has drawn significant attention, especially given the current trend of large multi-modal models, where dataset distillation can be an effective solution. However, the ...