We study a novel architecture and training procedure for locomotion tasks. A high-frequency, low-level "spinal" network with access to proprioceptive sensors learns sensorimotor primitives by training on simple tasks. This pre-trained module is fixed and connected to a low-frequency, high-level "cortical" network, with access to all sensors, which drives behavior by modulating the inputs to the spinal network. Where a monolithic end-to-end architecture fails completely, learning with a pre-trained spinal module succeeds at multiple high-level tasks, and enables the effective exploration required to learn from sparse rewards. We test our proposed architecture on three simulated bodies: a 16-dimensional swimming snake, a 20-dimensional quadruped, and a 54-dimensional humanoid. Our results are illustrated in the accompanying video at https://youtu.be/sboPYvhpraQ
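The two-timescale control loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the update period `K`, the random (untrained) weights, and the choice to feed the modulation signal to the spinal network by concatenation are all assumptions made for illustration, and the names `init_mlp`, `mlp`, and `rollout` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions, not values from the paper.
PROPRIO_DIM = 8   # proprioceptive features seen by the low-level network
EXTERO_DIM = 4    # additional task features seen only by the high-level network
MOD_DIM = 3       # size of the modulation signal sent to the low-level network
ACTION_DIM = 2    # actuator commands
K = 5             # the high-level network updates once every K control steps


def init_mlp(in_dim, hidden_dim, out_dim):
    """Random weights for a one-hidden-layer tanh network (untrained)."""
    return {
        "w1": rng.normal(0.0, 0.1, (in_dim, hidden_dim)),
        "w2": rng.normal(0.0, 0.1, (hidden_dim, out_dim)),
    }


def mlp(params, x):
    """Forward pass: tanh hidden layer followed by a tanh output layer."""
    return np.tanh(np.tanh(x @ params["w1"]) @ params["w2"])


# Low-level ("spinal") network: proprioception + modulation -> action.
# In the procedure described above it would be pre-trained and then fixed.
spinal = init_mlp(PROPRIO_DIM + MOD_DIM, 16, ACTION_DIM)
# High-level ("cortical") network: all sensors -> modulation signal.
cortical = init_mlp(PROPRIO_DIM + EXTERO_DIM, 16, MOD_DIM)


def rollout(observations):
    """Run both networks over a list of (proprio, extero) observation pairs."""
    actions, modulations = [], []
    modulation = np.zeros(MOD_DIM)
    for t, (proprio, extero) in enumerate(observations):
        if t % K == 0:
            # Low-frequency step: recompute the modulation from all sensors.
            modulation = mlp(cortical, np.concatenate([proprio, extero]))
        # High-frequency step: act on proprioception plus the held modulation.
        actions.append(mlp(spinal, np.concatenate([proprio, modulation])))
        modulations.append(modulation)
    return np.array(actions), np.array(modulations)


obs = [(rng.normal(size=PROPRIO_DIM), rng.normal(size=EXTERO_DIM))
       for _ in range(12)]
actions, modulations = rollout(obs)
```

In this sketch the modulation signal is held constant between high-level updates, so the spinal network produces a new action at every step while the cortical network intervenes only every `K` steps.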

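We study a novel architecture and training procedure in which a high-frequency, low-level "spinal" network with access to proprioceptive sensors learns sensorimotor primitives by training on simple tasks. Behavior is then driven by modulating the inputs to this pre-trained spinal module, which enables the effective exploration needed to learn from sparse rewards. The proposed architecture is tested on three simulated bodies (a 16-dimensional swimming snake, a 20-dimensional quadruped, and a 54-dimensional humanoid) and yields clear improvements; see the accompanying video.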

Learning and Transfer of Modulated Locomotor Controllers