BriefGPT.xyz
Oct, 2023
MetaTool Benchmark: Deciding Whether to Use Tools and Which to Use
Yue Huang, Jiawen Shi, Yuan Li, Chenrui Fan, Siyuan Wu...
TL;DR
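This paper presents MetaTool, a benchmark for evaluating whether large language models (LLMs) are aware of when to use tools and whether they can select the correct tool. Experiments show that most LLMs still struggle with tool selection.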
Abstract
Large language models (LLMs) have garnered significant attention due to their impressive natural language processing (NLP) capabilities. Recently, many studies have focused on the tool utilization ability of LLMs