BriefGPT.xyz
May, 2024
Understanding LLMs Requires More Than Statistical Generalization
Patrik Reizinger, Szilvia Ujváry, Anna Mészáros, Anna Kerekes, Wieland Brendel...
TL;DR
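This work examines generalization in deep learning, overparameterized models, non-identifiability, and inductive biases, and proposes promising research directions on generalization measures, transferability, and inductive biases for language models.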
Abstract
The last decade has seen blossoming research in deep learning theory attempting to answer, "Why does deep learning generalize?" A powerful shift in perspective precipitated this progress: the study of