BriefGPT.xyz
Jun, 2024
Asymptotically Optimal Regret for Black-Box Predict-then-Optimize
Samuel Tan, Peter I. Frazier
TL;DR
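This work considers the predict-then-optimize paradigm for decision-making: a supervised learning model is trained on historical data and then used to make future binary decisions in new contexts so as to maximize predicted reward. It proposes a new loss function, Empirical Soft Regret (ESR), designed to significantly improve the reward obtained when it is used to train the model. The method clearly outperforms existing algorithms on news recommendation and personalized healthcare decision problems.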
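To make the two-step paradigm concrete, here is a minimal sketch of a predict-then-optimize pipeline for binary decisions. It is an illustration under stated assumptions, not the authors' method: the reward model is a plain least-squares linear fit per decision (not the ESR loss, whose definition is not given here), the data are synthetic, and the names `fit_reward_model`, `decide`, and `true_w` are hypothetical.

```python
import numpy as np

def fit_reward_model(contexts, decisions, rewards):
    """Step 1 (predict): fit one least-squares linear reward model per decision.

    Assumption: ordinary squared-error training, used here only as a stand-in.
    """
    weights = {}
    for a in (0, 1):
        mask = decisions == a
        # Append a bias column to the contexts observed under decision a.
        X = np.hstack([contexts[mask], np.ones((mask.sum(), 1))])
        weights[a], *_ = np.linalg.lstsq(X, rewards[mask], rcond=None)
    return weights

def decide(weights, contexts):
    """Step 2 (optimize): choose the decision with the larger predicted reward."""
    X = np.hstack([contexts, np.ones((contexts.shape[0], 1))])
    preds = np.stack([X @ weights[0], X @ weights[1]], axis=1)
    return preds.argmax(axis=1)

# Synthetic historical data (hypothetical): contexts, logged decisions, rewards.
rng = np.random.default_rng(0)
n, d = 2000, 3
contexts = rng.normal(size=(n, d))
decisions = rng.integers(0, 2, size=n)
true_w = {0: np.array([1.0, 0.0, -1.0]), 1: np.array([-1.0, 0.5, 1.0])}
mean_reward = np.where(decisions == 0, contexts @ true_w[0], contexts @ true_w[1])
rewards = mean_reward + 0.1 * rng.normal(size=n)

# Train on history, then decide in new contexts.
weights = fit_reward_model(contexts, decisions, rewards)
new_contexts = rng.normal(size=(500, d))
chosen = decide(weights, new_contexts)

# Fraction of new contexts where the chosen decision matches the truly better one.
best = (new_contexts @ true_w[1] > new_contexts @ true_w[0]).astype(int)
agreement = (chosen == best).mean()
```

In this sketch the model is trained to predict rewards accurately, and decision quality is only measured afterwards through `agreement`. A loss aimed at decision quality, as the summary describes for ESR, would instead target the reward of the decisions the model induces.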
Abstract
We consider the predict-then-optimize paradigm for decision-making in which a practitioner (1) trains a supervised learning model on historical data of decisions, contexts, and rewards, and then (2) uses the resu