BriefGPT.xyz
Dec, 2023
Diffusion Reward: Learning Rewards via Conditional Video Diffusion
Tao Huang, Guangqi Jiang, Yanjie Ze, Huazhe Xu
TL;DR
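We propose Diffusion Reward, a framework that learns rewards from expert videos via conditional video diffusion models in order to solve complex visual reinforcement learning problems.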
Abstract
Learning rewards from expert videos offers an affordable and effective solution to specify the intended behaviors for reinforcement learning