BriefGPT.xyz
Sep, 2022
VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training
Yecheng Jason Ma, Shagun Sodhani, Dinesh Jayaraman, Osbert Bastani, Vikash Kumar...
TL;DR
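This work proposes VIP, a self-supervised representation learning method that uses self-supervised goal-conditioned reinforcement learning to produce dense, smooth reward functions from unlabeled human videos. This sidesteps the difficulty of collecting robot data, and the method performs strongly in experiments.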
Abstract
Reward and representation learning are two long-standing challenges for learning an expanding set of robot manipulation skills from sensory observations. Given the inherent cost and scarcity of in-domain, task-sp