Reusing large datasets is crucial to scale vision-based robotic manipulators to everyday scenarios due to the high cost of collecting robotic datasets. However, robotic platforms possess varying control schemes, camera viewpoints, kinematic configurations, and end-effector morphologies, posing significant challenges when transferring manipulation skills from one platform to another. To tackle this problem, we propose a set of key design decisions to train a single policy for deployment on multiple robotic platforms. Our framework first aligns the observation and action spaces of our policy across embodiments via utilizing wrist cameras and a unified, but modular codebase. To bridge the remaining domain shift, we align our policy's internal representations across embodiments through contrastive learning. We evaluate our method on a dataset collected over 60 hours spanning 6 tasks and 3 robots with varying joint configurations and sizes: the WidowX 250S, the Franka Emika Panda, and the Sawyer. Our results demonstrate significant improvements in success rate and sample efficiency for our policy when using new task data collected on a different robot, validating our proposed design decisions. More details and videos can be found on our anonymized project website: https://sites.google.com/view/polybot-multirobot

通过重新使用大型数据集，提出在多个机器人平台上训练单一策略的关键设计决策，通过对视觉和动作空间的对齐以及对内部表示的对齐来解决机器人平台之间的领域差异，验证了该方法在不同机器人上收集的新任务数据中成功率和样本效率的显著提升。

Polybot：在接纳变异性的同时，为多个机器人训练一个策略