Neural scaling laws characterize how model performance improves as model size scales up. Inspired by empirical observations, we introduce a resource model of neural scaling. A task is usually composite and can therefore be decomposed into many subtasks, which compete for resources (measured by the number of neurons allocated to each subtask). On toy problems, we empirically find that: (1) the loss of a subtask is inversely proportional to the number of neurons allocated to it; (2) when multiple subtasks are present in a composite task, the resources acquired by each subtask grow uniformly as models get larger, keeping the ratios of acquired resources constant. We hypothesize that these findings hold generally and build a model to predict neural scaling laws for general composite tasks, which successfully replicates the neural scaling law of Chinchilla models reported in arXiv:2203.15556. We believe that the notion of resource used in this paper will be a useful tool for characterizing and diagnosing neural networks.
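
The two findings can be combined in a small numerical sketch. It is an illustration under stated assumptions, not the paper's implementation: the composite loss is assumed to be the plain sum of subtask losses, and the function name `composite_loss`, the `shares` fractions, and the `coeffs` constants are hypothetical values chosen for the example.

```python
def composite_loss(total_neurons, shares, coeffs):
    """Composite loss under the two findings stated in the abstract.

    shares: assumed fixed fraction of neurons each subtask acquires (sums to 1).
    coeffs: hypothetical per-subtask constants c_i in loss_i = c_i / n_i.
    Assumption: the composite loss is the sum of the subtask losses.
    """
    if len(shares) != len(coeffs):
        raise ValueError("shares and coeffs must have the same length")
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    total = 0.0
    for share, c in zip(shares, coeffs):
        n_i = share * total_neurons  # finding (2): resource ratios stay constant
        total += c / n_i             # finding (1): loss inversely proportional to neurons
    return total


# Illustrative values only; they are not taken from the paper.
shares = [0.5, 0.3, 0.2]
coeffs = [1.0, 2.0, 0.5]

for n in (100, 200, 400):
    print(n, composite_loss(n, shares, coeffs))
```

Under these assumptions, doubling the total number of neurons halves every subtask loss, so the composite loss is also inversely proportional to the total number of neurons, whatever the fixed shares are.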

A Resource Model for Neural Scaling Laws