Gradient boosting is a state-of-the-art prediction technique that sequentially produces a model in the form of linear combinations of simple predictors---typically decision trees---by solving an infinite-dimensional convex optimization problem. We provide in the present paper a thorough analysis of two widespread versions of gradient boosting, and introduce a general framework for studying these algorithms from the point of view of functional optimization. We prove their convergence as the number of iterations tends to infinity and highlight the importance of having a strongly convex risk functional to minimize. We also present a reasonable statistical context ensuring consistency properties of the boosting predictors as the sample size grows. In our approach, the optimization procedures are run forever (that is, without resorting to an early stopping strategy), and statistical regularization is basically achieved via an appropriate $L^2$ penalization of the loss and strong convexity arguments.

本文针对一种最先进的预测技术——梯度提升方法，通过解决无限维凸优化问题，顺序地生成一个由简单预测器（通常为决策树）的线性组合构成的模型。我们对两个广泛使用的梯度提升版本进行了彻底的分析，并从函数优化的角度引入了一个通用框架来研究这些算法。证明了它们在迭代次数趋近于无穷时的收敛性，并强调了具有强凸风险函数的重要性。我们还提供了一个合理的统计环境，确保在样本大小增长时提高了预测器的一致性。在我们的方法中，优化程序是无限运行的（也就是说，没有采用早期停止策略），并且通过适当的$L^2$损失惩罚和强凸性论证来实现统计正则化。

梯度提升优化