Oct, 2023
Octopus: Embodied Vision-Language Programmer from Environmental Feedback
Jingkang Yang, Yuhao Dong, Shuai Liu, Bo Li, Ziyue Wang...
TL;DR
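Octopus is a novel large vision-language model that interprets an agent's visual input and textual task objectives, formulates intricate action sequences, and generates executable code. By open-sourcing the model architecture, simulator, and dataset, the authors aim to spur further innovation and foster collaborative applications across the broader embodied AI community.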
Abstract
Large vision-language models (VLMs) have achieved substantial progress in multimodal perception and reasoning. Furthermore, when seamlessly integrated into an embodied agent, it signifies a crucial stride towards …