Jun, 2021
Tuning Mixed Input Hyperparameters on the Fly for Efficient Population Based AutoRL
Jack Parker-Holder, Vu Nguyen, Shaan Desai, Stephen Roberts
TL;DR
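This paper introduces a new automated reinforcement learning algorithm that uses a time-varying bandit algorithm to jointly optimize continuous and categorical variables, improving generalization performance on the Procgen benchmark.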
Abstract
Despite a series of recent successes in reinforcement learning (RL), many RL algorithms remain sensitive to hyperparameters. As such, there has recently been interest in the field of …