BriefGPT.xyz
Nov, 2019
Safe Linear Thompson Sampling with Side Information
Ahmadreza Moradipari, Sanae Amani, Mahnoosh Alizadeh, Christos Thrampoulidis
TL;DR
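This paper proposes a new safe algorithm, based on linear Thompson Sampling, for the linear stochastic bandit problem. It introduces a linear safety constraint, reports how the learner performs relative to the setting without safety constraints, and compares the algorithm with safe algorithms based on UCB.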
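To make the summary concrete, here is a minimal sketch of one round of Thompson Sampling with a linear safety check. It is an illustration under stated assumptions, not the authors' algorithm: the constraint form `x @ B @ theta <= c`, the finite candidate set `actions`, the confidence scale `beta`, the fallback `safe_action`, and the function name `safe_lts_step` are all hypothetical choices made for this example.

```python
import numpy as np

def safe_lts_step(actions, V, b, B, c, beta, rng, safe_action):
    """One round of a safety-constrained linear Thompson Sampling sketch.

    Assumptions (not from the source): actions is an (n, d) array of
    candidates, the constraint is x^T B theta <= c with known B and c,
    and safe_action is a fallback known in advance to be safe.
    """
    actions = np.asarray(actions, dtype=float)
    V_inv = np.linalg.inv(V)
    theta_hat = V_inv @ b  # regularized least-squares estimate of theta

    # Pessimistic safety check: keep x only if the upper confidence bound
    # on x^T B theta stays below c.
    z = actions @ B  # row i equals (B^T x_i)^T
    width = np.sqrt(np.einsum("ij,jk,ik->i", z, V_inv, z))
    safe = z @ theta_hat + beta * width <= c

    # Thompson Sampling perturbation with covariance beta^2 * V^{-1}.
    L = np.linalg.cholesky(V_inv)
    theta_tilde = theta_hat + beta * (L @ rng.standard_normal(len(b)))

    if not safe.any():
        return np.asarray(safe_action, dtype=float)

    # Maximize the sampled reward over the estimated safe set only.
    scores = actions @ theta_tilde
    scores[~safe] = -np.inf
    return actions[np.argmax(scores)]
```

After each round, a caller would typically update `V += np.outer(x, x)` and `b += reward * x` for the played action `x`. Using the pessimistic bound for the safety check while sampling only for the reward is one common design; other choices are possible.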
Abstract
The design and performance analysis of bandit algorithms in the presence of stage-wise safety or reliability constraints has recently garnered significant interest. In this work, we consider the linear