hierarchical policies that combine language and low-level control have been
shown to perform impressively long-horizon robotic tasks, by leveraging either
zero-shot high-level planners like pretrained language and vision-language
models (LLMs/VLMs) or models trained on annotated roboti