BriefGPT.xyz
Apr, 2022
Evolutionary Multi-Armed Bandits with Genetic Thompson Sampling
Baihan Lin
TL;DR
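Proposes a multi-armed bandit algorithm based on genetic algorithms to improve sequential decision making in online learning. Experiments in simulated multi-armed bandit environments and on a real-world epidemic control problem show that the method significantly outperforms baseline algorithms. The paper also introduces EvoBandit, a web-based interactive visualization that guides readers through the entire learning process and supports lightweight evaluation.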
Abstract
As two popular schools of machine learning, online learning and evolutionary computations have become two important driving forces behind real-world decision making engines for applications in biomedicine, econom