This paper combines two contributions. First, we introduce Language-World, an extension of the Meta-World benchmark that allows a large language model (LLM) to operate in a simulated robotic environment using semi-structured natural language queries and scripted skills described in natural language. Because Language-World uses the same set of tasks as Meta-World, its results can be compared directly to Meta-World results, providing a point of comparison between recent methods using LLMs and those using deep reinforcement learning. Second, we introduce Plan Conditioned Behavioral Cloning (PCBC), a method for fine-tuning the behavior of high-level plans using end-to-end demonstrations. Using Language-World, we show that PCBC achieves strong performance in a variety of few-shot regimes, often achieving task generalization with as little as a single demonstration. Language-World is available as open-source software at https://github.com/krzentner/language-world/.

Conditionally Combining Robot Skills using Large Language Models