BriefGPT.xyz
Dec, 2022
You Only Need a Good Embeddings Extractor to Fix Spurious Correlations
Raghav Mehta, Vítor Albiero, Li Chen, Ivan Evtimov, Tamar Glaser...
TL;DR
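This paper studies how robust deep neural networks are to spurious correlations in training data. It proposes a linear classifier that needs no subgroup information during training and uses only the embeddings of a pretrained model as features, reaching 90% accuracy. Experiments show that the capacity of the pretrained model and the size of the dataset both affect the result.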
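The recipe described above (a frozen feature extractor followed by a linear classifier, with subgroup labels used only for evaluation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the "embeddings" are synthetic random vectors standing in for the output of a pretrained extractor, and the names `fit_linear_probe`, `predict`, and `worst_group_accuracy` are hypothetical helpers introduced here. Reporting the accuracy of the weakest subgroup is a common way to measure robustness to spurious correlations.

```python
import numpy as np

def fit_linear_probe(X, y, lr=0.1, steps=500, l2=1e-3):
    """Logistic regression by gradient descent on fixed embeddings X (n, d), labels y in {0, 1}."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        z = np.clip(X @ w + b, -30.0, 30.0)   # clip logits for numerical stability
        p = 1.0 / (1.0 + np.exp(-z))          # predicted probability of class 1
        grad = p - y                          # gradient of the log loss w.r.t. the logits
        w -= lr * (X.T @ grad / n + l2 * w)
        b -= lr * grad.mean()
    return w, b

def predict(X, w, b):
    """Hard 0/1 predictions from the linear probe."""
    return (X @ w + b > 0).astype(int)

def worst_group_accuracy(pred, y, group):
    """Minimum per-group accuracy; group labels are used for evaluation only."""
    return min(float((pred[group == g] == y[group == g]).mean())
               for g in np.unique(group))

# Assumption: synthetic vectors stand in for embeddings from a frozen pretrained model.
rng = np.random.default_rng(0)
n, d = 400, 16
y = rng.integers(0, 2, size=n)
# Spurious attribute (e.g. background) that agrees with the label about 90% of the time.
background = np.where(rng.random(n) < 0.9, y, 1 - y)
X = rng.normal(size=(n, d))
X[:, 0] += 3.0 * (2 * y - 1)            # dimension carrying the true class signal
X[:, 1] += 1.0 * (2 * background - 1)   # dimension carrying the spurious signal
group = 2 * y + background              # four (label, background) subgroups

w, b = fit_linear_probe(X, y)           # training never sees `group`
pred = predict(X, w, b)
avg_acc = float((pred == y).mean())
wg_acc = worst_group_accuracy(pred, y, group)
print("average accuracy:", avg_acc)
print("worst-group accuracy:", wg_acc)
```

In a real setting, `X` would be replaced by features computed once from a pretrained vision model, and the probe would be evaluated on a held-out split rather than on its own training data.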
Abstract
Spurious correlations in training data often lead to robustness issues since models learn to use them as shortcuts. For example, when predicting whether an object is a cow, a model might learn to rely on its gree