Conformal Nucleus Sampling

May, 2023

Conformal Nucleus Sampling

Shauli Ravfogel, Yoav Goldberg, Jacob Goldberger

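TL;DR: This paper uses conformal prediction to calibrate the $p$ parameter, examining whether top-$p$ sampling is aligned with its probabilistic meaning across various linguistic contexts. The results show that OPT models are overconfident, and that calibration shows a moderate inverse scaling with model size.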
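
For readers unfamiliar with the top-$p$ rule being calibrated, the following is a minimal sketch of plain (uncalibrated) nucleus sampling over a single next-word distribution. The function names `top_p_set` and `sample_top_p` are hypothetical and not taken from the paper. The sketch also assumes the set is closed as soon as the cumulative probability reaches $p$ (`>=`). It does not implement the conformal calibration step.

```python
import random


def top_p_set(probs, p):
    """Return indices of the smallest set of words whose cumulative
    probability reaches p, taking words in order of decreasing probability."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen, total = [], 0.0
    for i in order:
        chosen.append(i)
        total += probs[i]
        if total >= p:  # assumption: stop once the mass reaches p
            break
    return chosen


def sample_top_p(probs, p, rng=random):
    """Sample one word index from the top-p set, with probabilities
    renormalized inside the set."""
    nucleus = top_p_set(probs, p)
    weights = [probs[i] for i in nucleus]  # random.choices renormalizes weights
    return rng.choices(nucleus, weights=weights, k=1)[0]


if __name__ == "__main__":
    next_word_probs = [0.5, 0.3, 0.15, 0.05]  # toy distribution, illustration only
    print(top_p_set(next_word_probs, 0.9))
    print(sample_top_p(next_word_probs, 0.9))
```

In this picture, whether the top-$p$ set actually contains the correct next word about a fraction $p$ of the time is the calibration question the paper examines.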

Abstract

Language models generate text by successively sampling the next word. A decoding procedure based on nucleus (top-$p$) sampling chooses from the smallest possible set of words whose cumulative probability exceeds the probability $p$. In this work, we assess whether a top-$p$ set i