BriefGPT.xyz
Apr, 2021
Sensitivity and Stability of Model Interpretations in Natural Language Processing
On the Faithfulness Measurements for Model Interpretations
Fan Yin, Zhouxing Shi, Cho-Jui Hsieh, Kai-Wei Chang
TL;DR
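This work proposes two new criteria, sensitivity and stability, for measuring how accurately an interpretation reflects an NLP model's decision process. It also introduces a new interpretation method based on adversarial robustness and demonstrates its superiority under the corresponding criteria. The method and the metrics are additionally applied to dependency parsing.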
Abstract
Recent years have witnessed the emergence of a variety of post-hoc interpretations that aim to uncover how natural language processing (NLP) models make predictions. Despite the surge of new interpretations, it r