Combining empirical risk minimization with capacity control is a classical strategy in machine learning for controlling the generalization gap and avoiding overfitting as the capacity of the model class grows. Yet in modern deep learning practice, very large over-parameterized models (e.g., neural networks) are optimized to fit the training data perfectly and still generalize well. Past the interpolation point, increasing model complexity appears to lower the test error. In this tutorial, we explain the concept of double descent and its mechanisms. Section 1 sets up the classical statistical learning framework and introduces the double descent phenomenon. Through a number of examples, Section 2 introduces inductive biases, which appear to play a key role in double descent by selecting a smooth empirical risk minimizer among the many interpolating solutions. Finally, Section 3 explores double descent in two linear models and presents other points of view from recent related work.
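
The idea of an inductive bias that picks one interpolating solution among many can be illustrated with a small linear example. The following is a minimal sketch and is not taken from the tutorial itself: it assumes a random Gaussian design with more features than samples, uses the Moore-Penrose pseudoinverse as a stand-in for the selection rule, and all variable names (`w_min`, `w_other`, and so on) are hypothetical. It shows that two different weight vectors can both fit the training data exactly, while the pseudoinverse solution has the smaller Euclidean norm.

```python
import numpy as np

# Assumed toy setup: fewer samples than features, so infinitely many
# weight vectors interpolate the training data.
rng = np.random.default_rng(0)
n_samples, n_features = 5, 20
X = rng.normal(size=(n_samples, n_features))
y = rng.normal(size=n_samples)

# Minimum-norm interpolating solution via the pseudoinverse.
X_pinv = np.linalg.pinv(X)
w_min = X_pinv @ y

# A second interpolating solution: add a direction from the null space of X.
# v - X^+ X v is the projection of v onto that null space.
v = rng.normal(size=n_features)
null_component = v - X_pinv @ (X @ v)
w_other = w_min + null_component

# Both solutions fit the training data up to floating-point error.
train_err_min = np.max(np.abs(X @ w_min - y))
train_err_other = np.max(np.abs(X @ w_other - y))

# The pseudoinverse solution has the smaller Euclidean norm.
norm_min = np.linalg.norm(w_min)
norm_other = np.linalg.norm(w_other)

print(f"max train residual (min-norm): {train_err_min:.2e}")
print(f"max train residual (other):    {train_err_other:.2e}")
print(f"norms: min-norm = {norm_min:.3f}, other = {norm_other:.3f}")
```

In this sketch the preference for the small-norm solution is imposed explicitly through the pseudoinverse; it is meant only as an analogy for the way a training procedure might favor some interpolating solutions over others.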

Understanding the Double Descent Phenomenon in Deep Learning