Oct, 2023
Towards Lossless Dataset Distillation via Difficulty-Aligned Trajectory Matching
Ziyao Guo, Kai Wang, George Cazenavette, Hui Li, Kaipeng Zhang...
TL;DR
Develops a new dataset distillation method that remains effective as the synthetic dataset grows: by matching early or late expert training trajectories according to difficulty, it successfully scales trajectory matching to larger synthetic datasets and achieves lossless dataset distillation for the first time.
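The summary above only sketches the idea, so the following is a minimal PyTorch sketch of what difficulty-aligned trajectory matching might look like: a standard trajectory-matching objective plus a heuristic that maps the synthetic set's size (images per class) to a segment of the expert trajectory, the intended reading of "difficulty-aligned" being that smaller synthetic sets are matched to earlier, easier expert segments and larger ones to later, harder segments. The linear student, the `pick_segment` heuristic, the random placeholder expert checkpoints, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

DIM, NUM_CLASSES = 32, 10        # toy feature / class counts (assumption)
NUM_EXPERT_EPOCHS = 50           # length of the pretend expert trajectory

def pick_segment(ipc, ipc_max, window=10):
    """Assumed difficulty-alignment heuristic: small synthetic sets (low
    images-per-class) match early, easy expert epochs; large ones match
    later, harder epochs."""
    start = int(round((ipc / ipc_max) * (NUM_EXPERT_EPOCHS - window - 1)))
    return start, start + window

def matching_loss(theta_start, theta_target, syn_x, syn_y,
                  inner_steps=5, inner_lr=0.1):
    """Trajectory-matching objective: start a linear student at the expert's
    start checkpoint, train it on the synthetic data for a few steps, and
    measure the distance to the expert's target checkpoint, normalized by
    how far the expert itself moved over that segment."""
    w = theta_start.clone().requires_grad_(True)
    for _ in range(inner_steps):
        logits = syn_x @ w.t()                         # (n, NUM_CLASSES)
        inner = F.cross_entropy(logits, syn_y)
        grad, = torch.autograd.grad(inner, w, create_graph=True)
        w = w - inner_lr * grad                        # differentiable update
    return (w - theta_target).pow(2).sum() / (theta_start - theta_target).pow(2).sum()

# --- toy usage: optimize a synthetic set of 10 images per class ---
torch.manual_seed(0)
ipc, ipc_max = 10, 50
start_ep, end_ep = pick_segment(ipc, ipc_max)

# random placeholders standing in for expert checkpoints saved while
# training a (linear) teacher on the real dataset
expert = [torch.randn(NUM_CLASSES, DIM) for _ in range(NUM_EXPERT_EPOCHS)]

syn_x = torch.randn(ipc * NUM_CLASSES, DIM, requires_grad=True)  # learnable "images"
syn_y = torch.arange(NUM_CLASSES).repeat_interleave(ipc)         # fixed labels
opt = torch.optim.SGD([syn_x], lr=1.0)

for step in range(100):
    loss = matching_loss(expert[start_ep], expert[end_ep], syn_x, syn_y)
    opt.zero_grad()
    loss.backward()          # gradients flow through the inner updates into syn_x
    opt.step()
```

The key design choice surfaced here is the segment selection: the matching loss itself is unchanged, and only which slice of the expert trajectory gets matched depends on the synthetic set size.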
Abstract
The ultimate goal of dataset distillation is to synthesize a small synthetic dataset such that a model trained on this synthetic set will perform equally well as a model trained on the full, real dataset. Until n…