We propose an object-centric recovery policy framework to address the challenges of out-of-distribution (OOD) scenarios in visuomotor policy learning. Previous behavior cloning (BC) methods rely heavily on broad coverage of labeled training data and fail in unfamiliar spatial states. Without any extra data collection, our approach learns a recovery policy built from an inverse policy that is inferred from the gradient of the object keypoint manifold in the original training data. The recovery policy serves as a simple add-on to any base visuomotor BC policy, independent of the specific method, and guides the system back toward the training distribution to ensure task success even in OOD situations. We demonstrate the effectiveness of our object-centric framework in both simulation and real-robot experiments, achieving an improvement of 77.7% over the base policy in OOD scenarios. Project Website: https://sites.google.com/view/ocr-penn
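The idea of following a gradient back toward the training distribution can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes, purely for illustration, that the training keypoints are summarized by a Gaussian kernel density estimate and that the recovery direction is the gradient of its log density. The names `log_density_gradient`, `recovery_step`, `bandwidth`, and `step_size` are hypothetical, and the step that maps a keypoint-space direction to a robot action (the inverse policy) is omitted.

```python
import numpy as np


def log_density_gradient(x, train_keypoints, bandwidth=0.5):
    """Gradient of the log of a Gaussian KDE over the training keypoints, at x.

    Assumption: a KDE stands in for the keypoint manifold; this is an
    illustrative substitute, not the estimator used in the paper.
    """
    diffs = train_keypoints - x                      # (N, D) vectors from x to each sample
    sq_dists = np.sum(diffs ** 2, axis=1)            # (N,) squared distances
    log_w = -sq_dists / (2.0 * bandwidth ** 2)       # unnormalized log kernel weights
    w = np.exp(log_w - log_w.max())                  # subtract the max for numerical stability
    w /= w.sum()                                     # normalized kernel weights
    # grad log p(x) = sum_i w_i * (x_i - x) / bandwidth^2
    return (w[:, None] * diffs).sum(axis=0) / bandwidth ** 2


def recovery_step(x, train_keypoints, step_size=0.1, bandwidth=0.5):
    """Move the keypoint state one gradient-ascent step toward the training data."""
    return x + step_size * log_density_gradient(x, train_keypoints, bandwidth)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical training keypoints clustered near the origin.
    train = rng.normal(scale=0.1, size=(200, 3))
    # A keypoint state far outside the training data.
    x = np.array([2.0, 2.0, 2.0])
    for _ in range(50):
        x = recovery_step(x, train)
    print("distance to training mean:", np.linalg.norm(x - train.mean(axis=0)))
```

In this sketch, a base policy would take over once the state is back near the training data; how and when to switch between the recovery and base policies is a design choice that the sketch leaves open.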

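This work addresses the challenge of out-of-distribution (OOD) scenarios in visuomotor policy learning by proposing an object-centric recovery policy framework. By deriving an inverse policy from the gradient of the object keypoint manifold in the original training data, the method learns a recovery policy without additional data and raises the task success rate of conventional behavior cloning methods in OOD situations; experiments show an improvement of 77.7% over the baseline policy.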

An Object-Centric Inverse-Policy Recovery Framework for Out-of-Distribution Recovery in Visuomotor Imitation Learning