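TL;DR: This paper proposes S3NAS, a fast NPU-aware NAS method that searches for CNN architectures with higher accuracy under a given latency constraint in three steps: supernet design, single-path NAS, and scaling. Using this method, the authors found a network with 82.72% top-1 accuracy and 11.66 ms latency within 3 hours on TPUv3.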
Abstract
As the application area of convolutional neural networks (CNN) is growing in embedded devices, it becomes popular to use a hardware CNN accelerator, called neural processing unit (NPU), to achieve higher performa