Implicit regularization in deep learning is perceived as a tendency of gradient-based optimization to fit training data with predictors of minimal "complexity." The fact that only some types of data give rise to generalization is understood to result from them being especially amenable to fitting with low complexity predictors. A major challenge towards formalizing this intuition is to define complexity measures that are quantitative yet capture the essence of data that admits generalization. With an eye towards this challenge, we analyze the implicit regularization in tensor factorization, equivalent to a certain non-linear neural network. We characterize the dynamics that gradient descent induces on the factorization, and establish a bias towards low tensor rank, in compliance with existing empirical evidence. Then, motivated by tensor rank capturing implicit regularization of a non-linear neural network, we empirically explore it as a measure of complexity, and find that it stays extremely low when fitting standard datasets. This leads us to believe that tensor rank may pave way to explaining both implicit regularization of neural networks, and the properties of real-world data translating it to generalization.

采用动力学系统视角和贪心低秩张量搜索方法，我们得出了张量秩作为衡量复杂度和深度神经网络隐式正则化的方法，进而解释了深度学习中的隐式正则化和现实世界数据的性质对泛化的影响。

张量分解中的隐式正则化