Rituraj Kaushik, Pierre Desreumaux, Jean-Baptiste Mouret
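TL;DR: This paper proposes a repertoire-based online learning method that determines the best policy by matching against behavior repertoires built for different situations. It learns faster and more efficiently than Reset-free Trial and Error and other traditional single-repertoire methods, and it has been applied in practice to robot programming.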
Abstract
Among the data-efficient approaches for online adaptation in robotics (meta-learning, model-based reinforcement learning, etc.), repertoire-based learning (1) generates a large and diverse set of policies in simulation that acts as a "reservoir" for future adaptations and (2) learns to pi