Theresa Eimer, André Biedenkapp, Frank Hutter, Marius Lindauer
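TL;DR: Self-generated task curricula, obtained through adaptive learning, improve the generalization of reinforcement learning agents and speed up training.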
Abstract
Reinforcement learning (RL) has made considerable advances in solving a single
problem in a given environment, but learning policies that generalize to unseen
variations of a problem remains challenging. To improve sample efficiency for
learning on such instances of a problem domain, we p