BriefGPT.xyz
Jan, 2025
Disentangling Exploration of Large Language Models by Optimal Exploitation
Tim Grams, Patrick Betz, Christian Bartelt
TL;DR
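This study addresses the shortcomings of large language models in state-space exploration and proposes a new evaluation approach that treats exploration as the sole objective. By decomposing the missing reward into exploration and exploitation components, the experiments show that larger models outperform smaller ones at exploration, and the decomposition serves as a useful tool for improving model performance on exploration tasks.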
Abstract
Exploration is a crucial skill for self-improvement and open-ended problem-solving. However, it remains uncertain whether Large Language Models can effectively explore the …