We present the first systematic investigation of supervised scaling laws outside of an ImageNet-like context, using images of galaxies. We use 840k galaxy images and over 100M annotations by Galaxy Zoo volunteers, comparable in scale to ImageNet-1K. We find that adding annotated galaxy images provides a power-law improvement in performance across all architectures and all tasks, while adding trainable parameters is effective only for some (typically more subjectively challenging) tasks. We then compare the downstream performance of finetuned models pretrained either on ImageNet-12k alone or additionally on our galaxy images. We achieve an average relative error rate reduction of 31% across 5 downstream tasks of scientific interest. Our finetuned models are more label-efficient and, unlike their ImageNet-12k-pretrained equivalents, often achieve linear transfer performance equal to that of end-to-end finetuning. We find relatively modest additional downstream benefits from scaling model size, implying that scaling alone is not sufficient to address our domain gap, and suggest that practitioners with qualitatively different images might benefit more from in-domain adaptation followed by targeted downstream labelling.
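
As a reading aid, the two quantities the abstract relies on (a power-law trend in performance versus dataset size, and a relative error rate reduction) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `fit_power_law` and `relative_error_reduction` are hypothetical, the functional form loss ≈ a · N^(-b) is an assumed parameterisation, and every number is synthetic rather than a result from the paper.

```python
import math


def fit_power_law(n_samples, losses):
    """Fit loss ~ a * n**(-b) by ordinary least squares in log-log space.

    Returns (a, b). The functional form is an assumption made for
    illustration; the source does not state its exact parameterisation.
    """
    xs = [math.log(n) for n in n_samples]
    ys = [math.log(loss) for loss in losses]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def relative_error_reduction(baseline_error, new_error):
    """Fraction of the baseline error rate removed by the new model."""
    return (baseline_error - new_error) / baseline_error


# Synthetic values generated from a known power law (NOT data from the paper).
n = [1e3, 1e4, 1e5, 8.4e5]
loss = [2.0 * size ** -0.25 for size in n]
a, b = fit_power_law(n, loss)

# Hypothetical error rates chosen only to show the arithmetic.
reduction = relative_error_reduction(0.10, 0.069)
```

A straight line in log-log space corresponds to a power law, so the fitted slope gives the exponent directly; on noisy measurements the fit would be approximate rather than exact.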

Scaling Laws for Galaxy Images