May, 2023
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence
Scores from Language Models Fine-Tuned with Human Feedback
TL;DR: This study evaluates practical methods for eliciting confidence scores from language models fine-tuned with reinforcement learning from human feedback (RLHF). With well-chosen prompting strategies and temperature scaling, the authors reduce calibration error by more than 50%.
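As a rough illustration of the two ingredients mentioned in the TL;DR, here is a minimal Python sketch of expected calibration error (assuming ECE is the calibration metric behind the >50% figure) and standard temperature scaling of logits. The function names, binning scheme, and toy data below are assumptions for illustration, not the paper's released code.

```python
import numpy as np

def expected_calibration_error(confidences, correctness, n_bins=10):
    """ECE: bin predictions by confidence and average the
    |accuracy - mean confidence| gap, weighted by bin size."""
    confidences = np.asarray(confidences, dtype=float)
    correctness = np.asarray(correctness, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correctness[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by fraction of samples in bin
    return ece

def temperature_scale(logits, temperature):
    """Soften (T > 1) or sharpen (T < 1) logits before the softmax."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # numerical stability
    p = np.exp(z)
    return p / p.sum()

# Hypothetical usage: confidences verbalized by the model when prompted,
# e.g. "Answer, then state your confidence as a probability in [0, 1]."
verbalized_conf = [0.90, 0.80, 0.60, 0.95, 0.70]
is_correct      = [1,    1,    0,    1,    0]
print("ECE:", expected_calibration_error(verbalized_conf, is_correct, n_bins=5))
```

The temperature T would typically be fit on a held-out set to minimize negative log-likelihood or ECE; the paper's specific prompt templates and fitting procedure are not reproduced here.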