BriefGPT.xyz
Mar, 2024
Cross Domain Policy Transfer with Effect Cycle-Consistency
Ruiqi Zhu, Tianhong Dai, Oya Celiktutan
TL;DR
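We propose a new method that learns mapping functions between the state and action spaces of two domains from unpaired data. The mappings are trained with a symmetric optimization structure over transition effects, so that a robot policy can be transferred from the source domain to the target domain. This enables transfer learning between robots with different state and action spaces and significantly reduces alignment error.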
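To make the idea of aligning transition effects concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function `effect_cycle_loss`, the toy linear dynamics, the definition of an "effect" as the change in state, and the hand-written maps `H`, `G`, and `H_inv` are all assumptions chosen for illustration. In the paper's setting these mappings would be learned from unpaired data rather than written by hand.

```python
import numpy as np

def effect_cycle_loss(s_src, a_src, step_src, step_tgt, H, G, H_inv):
    """Hypothetical effect cycle-consistency loss on a batch of source transitions.

    H maps source states to target states, G maps source actions to target
    actions, and H_inv maps target states back to source states.
    """
    # Effect of the action in the source domain (assumed here: change in state).
    effect_src = step_src(s_src, a_src) - s_src
    # Translate the state and action into the target domain and step there.
    s_tgt_next = step_tgt(H(s_src), G(a_src))
    # Map the target outcome back and measure the effect in source coordinates.
    effect_cycle = H_inv(s_tgt_next) - s_src
    # Mean squared disagreement between the two effects.
    return float(np.mean((effect_cycle - effect_src) ** 2))

# Toy domains (assumed): the target state is the source state scaled by 2,
# and the target actuator is half as strong as the source actuator.
step_src = lambda s, a: s + a
step_tgt = lambda s, a: s + 0.5 * a

rng = np.random.default_rng(0)
s = rng.normal(size=(64, 3))
a = rng.normal(size=(64, 3))

H = lambda x: 2.0 * x        # source state -> target state
H_inv = lambda x: x / 2.0    # target state -> source state

# A correctly aligned action map reproduces the source effect exactly.
aligned_loss = effect_cycle_loss(s, a, step_src, step_tgt, H, lambda u: 4.0 * u, H_inv)
# A misaligned action map (identity) produces a smaller effect, so the loss is positive.
misaligned_loss = effect_cycle_loss(s, a, step_src, step_tgt, H, lambda u: u, H_inv)
```

In this toy setup the aligned maps drive the loss to (numerically) zero while the identity action map does not, which is the signal a learner could minimize to find the mappings. A symmetric version would apply the same check starting from target-domain transitions.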
Abstract
Training a robotic policy from scratch using deep reinforcement learning methods can be prohibitively expensive due to sample inefficiency. To address this challenge, transferring policies trained in the source d