BriefGPT.xyz
Jul, 2023
Contrastive Example-Based Control
Kyle Hatch, Benjamin Eysenbach, Rafael Rafailov, Tianhe Yu, Ruslan Salakhutdinov...
TL;DR
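This work proposes an example-based method for offline control that learns an implicit model of multi-step transitions to represent Q-values. It outperforms baseline methods on state-based and image-based offline control tasks, and shows improved robustness and scaling with dataset size.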
Abstract
While many real-world problems might benefit from reinforcement learning, these problems rarely fit into the MDP mold: interacting with the environment is often expensive and specifying reward functions is c