The effectiveness of machine learning algorithms arises from being able to extract useful features from large amounts of data. As model and dataset sizes increase, dataset distillation methods that compress large datasets into significantly smaller yet highly performant ones will become valuable in terms of training efficiency and useful feature extraction. To that end, we apply a novel distributed kernel-based meta-learning framework to achieve state-of-the-art results for dataset distillation using infinitely wide convolutional neural networks. For instance, using only 10 datapoints (0.02% of the original dataset), we obtain over 64% test accuracy on the CIFAR-10 image classification task, a dramatic improvement over the previous best test accuracy of 40%. Our state-of-the-art results extend across many other settings for MNIST, Fashion-MNIST, CIFAR-10, CIFAR-100, and SVHN. Furthermore, we perform some preliminary analyses of our distilled datasets to shed light on how they differ from naturally occurring data.
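
As a rough illustration of what a kernel-based distillation objective can look like, the sketch below fits kernel ridge regression on a tiny learnable support set, measures its error on the full training data, and then adjusts the support inputs to reduce that error. It is a toy sketch under stated assumptions, not the authors' implementation: an RBF kernel stands in for the infinite-width convolutional network kernel, finite differences replace automatic differentiation, the data are synthetic 2D blobs rather than images, and the names `rbf_kernel`, `krr_loss`, `numerical_grad`, and `distill` are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    # Stand-in kernel (assumption); the paper uses infinite-width conv-net kernels.
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def krr_loss(x_s, y_s, x_t, y_t, reg=1e-6):
    # Fit kernel ridge regression on the support set, score it on the target data.
    k_ss = rbf_kernel(x_s, x_s)
    k_ts = rbf_kernel(x_t, x_s)
    alpha = np.linalg.solve(k_ss + reg * np.eye(len(x_s)), y_s)
    return 0.5 * np.mean((k_ts @ alpha - y_t) ** 2)

def numerical_grad(f, x, eps=1e-5):
    # Central finite differences; a practical implementation would use autodiff.
    g = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        step = np.zeros_like(x)
        step[idx] = eps
        g[idx] = (f(x + step) - f(x - step)) / (2 * eps)
    return g

def distill(x_s, y_s, x_t, y_t, steps=100, lr=0.5):
    # Gradient descent on the support inputs; a step is kept only if it lowers the loss.
    loss = krr_loss(x_s, y_s, x_t, y_t)
    for _ in range(steps):
        grad = numerical_grad(lambda z: krr_loss(z, y_s, x_t, y_t), x_s)
        candidate = x_s - lr * grad
        new_loss = krr_loss(candidate, y_s, x_t, y_t)
        if new_loss < loss:
            x_s, loss = candidate, new_loss
        else:
            lr *= 0.5  # shrink the step when it overshoots
    return x_s, loss

rng = np.random.default_rng(0)
# Toy target data: two Gaussian blobs in 2D with regression labels -1 / +1.
x_t = np.concatenate([rng.normal(-1.0, 1.0, (100, 2)),
                      rng.normal(1.0, 1.0, (100, 2))])
y_t = np.concatenate([-np.ones((100, 1)), np.ones((100, 1))])
# Support set: one real point per class as initialization, then optimized.
x_s0 = np.stack([x_t[3], x_t[150]])
y_s = np.array([[-1.0], [1.0]])
initial_loss = krr_loss(x_s0, y_s, x_t, y_t)
x_s, final_loss = distill(x_s0, y_s, x_t, y_t)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Here the two optimized support points play the role of the distilled dataset: a model fit on them alone is evaluated on all 200 target points. Accepting only loss-decreasing steps keeps the toy optimizer stable without tuning the learning rate.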

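Using a novel distributed kernel-based meta-learning framework with infinitely wide convolutional neural networks, this work achieves state-of-the-art results in dataset distillation. Preliminary analyses of the distilled data across several datasets, including MNIST, Fashion-MNIST, CIFAR-10, CIFAR-100, and SVHN, offer some insight into how it differs from naturally occurring data.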

Dataset Distillation with Infinitely Wide Convolutional Neural Networks