BriefGPT.xyz
Oct, 2023
SemiReward: A General Reward Model for Semi-supervised Learning
Siyuan Li, Weiyang Jin, Zedong Wang, Fang Wu, Zicheng Liu...
TL;DR
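Proposes SemiReward, a semi-supervised reward framework that predicts reward scores to evaluate and filter high-quality pseudo labels. It targets the confirmation bias problem in semi-supervised learning and aims for high-quality labels, fast convergence, and task versatility.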
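The filtering step described above can be illustrated with a minimal sketch. It assumes each pseudo label already comes with a reward score; how SemiReward actually computes those scores is not covered in this summary, so the scores here are toy values. The function name `filter_pseudo_labels` and the `threshold` cutoff are hypothetical and not taken from the paper.

```python
from typing import List, Sequence, Tuple


def filter_pseudo_labels(
    pseudo_labels: Sequence[int],
    reward_scores: Sequence[float],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[Tuple[int, int]]:
    """Return (sample_index, pseudo_label) pairs whose reward score meets the threshold."""
    if len(pseudo_labels) != len(reward_scores):
        raise ValueError("pseudo_labels and reward_scores must have equal length")
    return [
        (i, label)
        for i, (label, score) in enumerate(zip(pseudo_labels, reward_scores))
        if score >= threshold
    ]


# Toy values standing in for a reward model's output (assumption).
labels = [2, 0, 1, 1]
scores = [0.91, 0.12, 0.55, 0.49]
kept = filter_pseudo_labels(labels, scores, threshold=0.5)
print(kept)
```

Only the samples at indices 0 and 2 pass the cutoff here; in a self-training loop, those would be the pseudo labels passed on to the next training step.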
Abstract
Semi-supervised learning (SSL) has witnessed great progress with various improvements in the self-training framework with pseudo labeling.