BriefGPT.xyz
Mar, 2024
Exploring the Impact of Dataset Bias on Dataset Distillation
Yao Lu, Jianyang Gu, Xuguang Chen, Saeed Vahidian, Qi Xuan
TL;DR
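Dataset distillation (DD) generates a small synthetic dataset from a larger one. This work investigates how dataset bias affects DD performance and proposes ways to address it. Experiments show that bias in the original dataset significantly affects the performance of the synthetic dataset, highlighting the need to identify and mitigate bias during the DD process.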
Abstract
Dataset distillation (DD) is a promising technique to synthesize a smaller dataset that preserves essential information from the original dataset. This synthetic dataset can serve as a substitute for the original