BriefGPT.xyz
Mar, 2024
Android in the Zoo: Chain-of-Action-Thought for GUI Agents
Jiwen Zhang, Jihao Wu, Yihua Teng, Minghui Liao, Nuo Xu...
TL;DR
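The Chain-of-Action-Thought (CoAT) framework reasons over a description of previous actions, the current screen, and the outcome that the chosen action is expected to produce. Combined with large language models, it enables smartphone agents to complete tasks triggered by natural language and significantly improves goal progress.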
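As a rough illustration of the idea above, the following Python sketch assembles the pieces a Chain-of-Action-Thought step draws on (previous actions, the current screen, a thought about the next action, and its expected result) into a text prompt. The names `CoATStep` and `build_prompt`, the field names, the prompt wording, and the `click(icon='Clock')` action string are all hypothetical and chosen for illustration; they are not taken from the paper and may differ from its actual format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CoATStep:
    """One step of an episode (hypothetical structure, not the paper's schema)."""
    screen_description: str  # textual description of the screen at this step
    action_thought: str      # reasoning about which action to take
    action: str              # the action that was chosen
    expected_result: str     # the outcome the action is expected to lead to


def build_prompt(goal: str, history: List[CoATStep], screen_description: str) -> str:
    """Combine the goal, previous actions, and current screen into one prompt."""
    if history:
        previous = "\n".join(
            f"{i}. {step.action} -> {step.expected_result}"
            for i, step in enumerate(history, start=1)
        )
    else:
        previous = "(none)"
    return (
        f"Goal: {goal}\n"
        f"Previous actions:\n{previous}\n"
        f"Current screen: {screen_description}\n"
        "Think about the next action, then state the action and its expected result."
    )


# Illustrative episode with a single earlier step (made-up values).
history = [
    CoATStep(
        screen_description="Home screen with app icons",
        action_thought="The Clock app sets alarms, so open it first.",
        action="click(icon='Clock')",
        expected_result="The Clock app opens",
    )
]
prompt = build_prompt("Set an alarm for 7 am", history, "Clock app, Alarm tab visible")
print(prompt)
```

The resulting string could be passed to any language model; how the model's reply is parsed back into an executable action is left out of this sketch.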
Abstract
Large language models (LLMs) have led to a surge of autonomous GUI agents for smartphones, which complete a task triggered by natural language by predicting a sequence of API actions. Even though the task high