Data Augmentation as Feature Manipulation

Mar, 2022

Data Augmentation as Feature Manipulation: a story of desert cows and grass cows

Ruoqi Shen, Sébastien Bubeck, Suriya Gunasekar

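TL;DR: This paper studies how data augmentation affects the dynamics of the learning process. It finds that data augmentation can change the relative importance of different features, an effect that is more pronounced for nonlinear models such as neural networks, so data augmentation can be viewed as a form of feature manipulation.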

Abstract

Data augmentation is a cornerstone of the machine learning pipeline, yet its theoretical underpinnings remain unclear. Is it merely a way to artificially augment the data set size? Or is it about encouraging the