Jul, 2021
Can Transformer Models Measure Coherence In Text? Re-Thinking the Shuffle Test
Philippe Laban, Luke Dai, Lucas Bandarkar, Marti A. Hearst
TL;DR
This paper fine-tunes a RoBERTa model on the Shuffle Test and achieves 97.8% accuracy, but proposes evaluating models in a zero-shot setting instead; it also introduces the k-Block Shuffle Test as a simple way to probe the coherence-measuring ability of NLP models.
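The Shuffle Test permutes the sentences of a document and asks a model to distinguish the shuffled version from the original; the k-Block variant shuffles consecutive blocks of k sentences instead of individual sentences. A minimal sketch of constructing such an instance (the helper name and sentence pre-segmentation are assumptions, not the authors' code; with k=1 this reduces to the standard Shuffle Test):

```python
import random

def k_block_shuffle(sentences, k, seed=0):
    """Return the sentences with consecutive blocks of k sentences
    shuffled -- a sketch of the k-Block Shuffle Test construction.
    Each block stays internally ordered; only block order changes."""
    blocks = [sentences[i:i + k] for i in range(0, len(sentences), k)]
    rng = random.Random(seed)  # fixed seed for reproducibility
    rng.shuffle(blocks)
    return [s for block in blocks for s in block]

doc = ["S1.", "S2.", "S3.", "S4.", "S5.", "S6."]
shuffled = k_block_shuffle(doc, k=2)
```

A coherence model would then be scored on how often it ranks the original document as more coherent than its shuffled counterpart.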
Abstract
The shuffle test is the most common task to evaluate whether NLP models can measure coherence in text. Most recent work uses direct supervision on the task; we show that by simply finetuning a …