Nov, 2024
Grounding Video Models to Actions through Goal Conditioned Exploration
Yunhao Luo, Yilun Du
TL;DR
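This work addresses the problem that large video models are not grounded for concrete use by embodied agents, and proposes a new method that grounds video models directly to continuous actions through self-exploration. The study shows that the framework can solve complex tasks without external supervision, performing on par with or better than several behavior cloning baselines trained on expert demonstrations, which points to significant application potential.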
Abstract
Large video models, pretrained on massive amounts of Internet video, provide a rich source of physical knowledge about the dynamics and motions of objects and tasks. However, video models are not grounded in the