Enhancing generalization and uncertainty quantification in pre-trained language models (PLMs) is crucial for their effectiveness and reliability. Building on machine learning research that established the importance of robustness for improving generalization, we investigate the role of representation smoothness, achieved via Jacobian and Hessian regularization, in enhancing PLM performance. Although such regularization methods have proven effective in computer vision, their application in natural language processing (NLP), where PLM inputs are derived from a discrete domain, poses unique challenges. We introduce a novel two-phase regularization approach, JacHess, which minimizes the norms of the Jacobian and Hessian matrices within PLM intermediate representations relative to their inputs. Our evaluation using the GLUE benchmark demonstrates that JacHess significantly improves in-domain generalization and calibration in PLMs, outperforming unregularized fine-tuning and other similar regularization methods.
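To make the penalized quantities concrete, the following is a minimal sketch of the squared Frobenius norms of a Jacobian and a Hessian for a small continuous map. It is not the JacHess implementation: the function `toy_layer`, the finite-difference approach, the random-projection estimator, and the weights `lam_jac` and `lam_hess` are all illustrative assumptions, and the discrete-input issue is sidestepped by assuming the input is already a continuous vector.

```python
import math
import random


def jacobian_frobenius_sq(f, x, eps=1e-4):
    """Squared Frobenius norm of the Jacobian of f at x (central differences)."""
    total = 0.0
    for i in range(len(x)):
        plus, minus = list(x), list(x)
        plus[i] += eps
        minus[i] -= eps
        # Column i of the Jacobian: partial derivatives of every output w.r.t. x_i.
        col = [(a - b) / (2 * eps) for a, b in zip(f(plus), f(minus))]
        total += sum(c * c for c in col)
    return total


def hutchinson_jacobian_sq(f, x, num_samples=64, eps=1e-4, rng=None):
    """Random-projection estimate: E ||J u||^2 = ||J||_F^2 for u ~ N(0, I)."""
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(num_samples):
        u = [rng.gauss(0.0, 1.0) for _ in x]
        plus = [xi + eps * ui for xi, ui in zip(x, u)]
        minus = [xi - eps * ui for xi, ui in zip(x, u)]
        # Directional derivative J u, approximated by a central difference.
        ju = [(a - b) / (2 * eps) for a, b in zip(f(plus), f(minus))]
        total += sum(c * c for c in ju)
    return total / num_samples


def hessian_frobenius_sq(f, x, eps=1e-3):
    """Sum over output components of the squared Frobenius norm of each Hessian."""
    def shifted(i, si, j, sj):
        y = list(x)
        y[i] += si * eps
        y[j] += sj * eps
        return f(y)

    total = 0.0
    for i in range(len(x)):
        for j in range(len(x)):
            pp, pm = shifted(i, 1, j, 1), shifted(i, 1, j, -1)
            mp, mm = shifted(i, -1, j, 1), shifted(i, -1, j, -1)
            for a, b, c, d in zip(pp, pm, mp, mm):
                # Mixed second partial derivative of one output component.
                second = (a - b - c + d) / (4 * eps * eps)
                total += second * second
    return total


def smoothness_penalty(f, x, lam_jac=1.0, lam_hess=1.0):
    """Hypothetical combined penalty; the weights are assumptions, not paper values."""
    return lam_jac * jacobian_frobenius_sq(f, x) + lam_hess * hessian_frobenius_sq(f, x)


def toy_layer(x):
    """Hypothetical stand-in for one intermediate representation."""
    return [math.tanh(x[0] + 2.0 * x[1]), x[0] * x[1]]


if __name__ == "__main__":
    point = [0.3, -0.7]
    print(jacobian_frobenius_sq(toy_layer, point))
    print(hutchinson_jacobian_sq(toy_layer, point, num_samples=256))
    print(smoothness_penalty(toy_layer, point, lam_jac=0.1, lam_hess=0.01))
```

In practice such a penalty would be added to the task loss and computed with automatic differentiation rather than finite differences. The random-projection variant is included because forming full Jacobians is expensive for high-dimensional representations.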

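This study investigates how Jacobian and Hessian regularization can improve generalization and uncertainty quantification in pre-trained language models (PLMs). We introduce a new two-phase regularization method, JacHess, which minimizes the norms of the Jacobian and Hessian matrices of PLM intermediate representations with respect to their inputs. Our evaluation on the GLUE benchmark shows that JacHess significantly improves in-domain generalization and calibration in PLMs, outperforming unregularized fine-tuning and other similar regularization methods.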

From Robustness to Improved Generalization and Calibration in Pre-trained Language Models