July 2021
Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination
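TL;DR: This paper studies how logarithmic loss and triangular discrimination can be used to obtain first-order guarantees for contextual bandits, that is, regret bounds that scale with the cumulative loss of the best policy and are therefore tighter in the low-noise regime. The paper reports strong results for this approach.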