Learned language-conditioned robot policies often struggle to adapt to new real-world tasks, even when pre-trained across a diverse set of instructions. We propose a novel approach for few-shot adaptation to unseen tasks that exploits the semantic understanding of task decomposition provided by vision-language models (VLMs). Our method, Policy Adaptation via Language Optimization (PALO), combines a handful of demonstrations of a task with proposed language decompositions sampled from a VLM to enable rapid nonparametric adaptation, avoiding the need for a larger fine-tuning dataset. We evaluate PALO in extensive real-world experiments consisting of challenging unseen, long-horizon robot manipulation tasks. We find that PALO is able to consistently complete long-horizon, multi-tier tasks in the real world, outperforming state-of-the-art pre-trained generalist policies and methods that have access to the same demonstrations.
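
To make the adaptation step described above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that "nonparametric adaptation" can be illustrated by keeping a pre-trained policy frozen and choosing, among candidate language decompositions, the one under which the policy best reproduces the demonstrated actions. The names `decomposition_cost`, `select_decomposition`, `split_evenly`, and `toy_policy` are hypothetical, the squared-error cost is an assumption, and the equal-length alignment of subtasks to timesteps is a simplification chosen here; the abstract does not specify any of these details. In practice the candidates would be sampled from a VLM, which is replaced here by a hard-coded list.

```python
from typing import Callable

# A demonstration is a list of (observation, action) pairs of plain float vectors.
Step = tuple[list[float], list[float]]
Demo = list[Step]
# A frozen policy maps (observation, instruction) to a predicted action.
Policy = Callable[[list[float], str], list[float]]


def split_evenly(length: int, parts: int) -> list[int]:
    """Assign each timestep a subtask index using equal-length segments (assumption)."""
    return [min(t * parts // length, parts - 1) for t in range(length)]


def decomposition_cost(policy: Policy, demos: list[Demo], subtasks: list[str]) -> float:
    """Mean squared error between demonstrated actions and the policy's predictions
    when each timestep is conditioned on its assigned subtask instruction."""
    total, count = 0.0, 0
    for demo in demos:
        assignment = split_evenly(len(demo), len(subtasks))
        for (obs, action), k in zip(demo, assignment):
            pred = policy(obs, subtasks[k])
            total += sum((p - a) ** 2 for p, a in zip(pred, action))
            count += 1
    return total / count


def select_decomposition(policy: Policy, demos: list[Demo],
                         candidates: list[list[str]]) -> list[str]:
    """Pick the candidate decomposition with the lowest cost; policy weights are untouched."""
    return min(candidates, key=lambda subtasks: decomposition_cost(policy, demos, subtasks))


def toy_policy(obs: list[float], instruction: str) -> list[float]:
    # Hypothetical stand-in for a pre-trained policy: +1 for "open", -1 for "close", else 0.
    direction = {"open the drawer": 1.0, "close the drawer": -1.0}.get(instruction, 0.0)
    return [direction]


# One toy demonstration: two "opening" steps followed by two "closing" steps.
demo: Demo = [([0.0], [1.0]), ([1.0], [1.0]), ([2.0], [-1.0]), ([1.0], [-1.0])]

# Hard-coded stand-ins for decompositions that a VLM might propose.
candidates = [
    ["close the drawer", "open the drawer"],
    ["open the drawer", "close the drawer"],
    ["wave at the camera"],
]

best = select_decomposition(toy_policy, [demo], candidates)
```

In this toy setup the selected decomposition is the one whose subtask order matches the demonstration. The point of the sketch is only that the search runs over language, so a handful of demonstrations can serve as a scoring signal rather than as a fine-tuning dataset.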

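This work addresses the difficulty that learned language-conditioned robot policies have in adapting to new real-world tasks. We propose a novel method, PALO, that exploits the semantic understanding of task decomposition provided by vision-language models, combining a small number of demonstrations with language decompositions to achieve rapid nonparametric adaptation. Experimental results show that PALO performs strongly on long-horizon, multi-tier tasks, outperforming existing state-of-the-art pre-trained generalist policies.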

Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation