Stochastic optimization is a widely used approach for optimization under
uncertainty, where uncertain input parameters are modeled by random variables.
Exact or approximation algorithms have been obtained for several fundamental
problems in this area. However, a significant limitation of this approach is
that it requires full knowledge of the underlying probability distributions.
Can we still get good (approximation) algorithms if these distributions are
unknown, and the algorithm needs to learn them through repeated interactions?
In this paper, we resolve this question for a large class of "monotone"
stochastic problems, by providing a generic online learning algorithm with
$\sqrt{T \log T}$ regret relative to the best approximation algorithm (under
known distributions). Importantly, our online algorithm works in a semi-bandit
setting, where in each period, the algorithm only observes samples from the
random variables that were actually probed. Our framework applies to several fundamental
problems in stochastic optimization such as prophet inequality, Pandora's box,
stochastic knapsack, stochastic matchings and stochastic submodular
optimization.

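Summary: For a large class of "monotone" stochastic problems, we resolve the question of whether good (approximation) algorithms can still be obtained when the distributions are unknown and must be learned. We do so by providing a generic online learning algorithm, operating in a semi-bandit setting, with $\sqrt{T \log T}$ regret relative to the best approximation algorithm (under known distributions). Our framework applies to several fundamental problems in stochastic optimization, such as prophet inequality, Pandora's box, stochastic knapsack, stochastic matchings and stochastic submodular optimization.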