Generalization of NLP Models: Notion and Causation

Nov 2023

Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor

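TL;DR: Explores the foundations of generalizability in machine learning models and examines the factors that affect it, with particular attention to internal validity, external validity, and spurious correlations, and offers guidance for analyzing generalization failures.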

Abstract

The NLP community typically relies on performance of a model on a held-out test set to assess generalization. Performance drops observed in datasets outside of official test sets are generally attributed to "out-of-distribution" effects. Here, we explore the foundations of generalizability