BriefGPT.xyz
Apr, 2023
DeepGEMM: Accelerated Ultra Low-Precision Inference on CPU Architectures using Lookup Tables
Darshan C. Ganji, Saad Ashfaq, Ehsan Saboori, Sudhakar Sah, Saptarshi Mitra...
TL;DR
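By building lookup tables and accessing them efficiently at inference time, DeepGEMM runs ultra low-precision convolutional neural networks on SIMD hardware and outperforms the corresponding 8-bit integer kernels in existing frameworks by up to 1.74x.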
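The lookup-table idea in the TL;DR can be illustrated with a small scalar sketch. This is not the DeepGEMM kernel, which targets SIMD hardware. It only shows how precomputed products can replace multiplications in a low-bit dot product. The 2-bit width, the signed decoding, and the names `decode`, `LUT`, `lut_dot` and `reference_dot` are all assumptions made for illustration.

```python
import itertools

BITS = 2              # assumed bit width, chosen only for illustration
LEVELS = 1 << BITS    # 4 possible codes per operand


def decode(code):
    """Hypothetical signed decoding: codes 0..3 map to values -2..1."""
    return code - (LEVELS >> 1)


# Precompute every weight*activation product once.
# Entry index is (w_code << BITS) | a_code, so the table has LEVELS**2 entries.
LUT = [decode(w) * decode(a)
       for w, a in itertools.product(range(LEVELS), repeat=2)]


def lut_dot(w_codes, a_codes):
    """Dot product of two code vectors using table lookups, no multiplies."""
    return sum(LUT[(w << BITS) | a] for w, a in zip(w_codes, a_codes))


def reference_dot(w_codes, a_codes):
    """Plain multiply-accumulate on decoded values, used as a check."""
    return sum(decode(w) * decode(a) for w, a in zip(w_codes, a_codes))


if __name__ == "__main__":
    w = [0, 1, 2, 3]
    a = [3, 2, 1, 0]
    print(lut_dot(w, a), reference_dot(w, a))
```

With 2-bit operands the table has only 16 entries, which is what makes it small enough to keep close to the compute units. A vectorized kernel would perform many such lookups per instruction instead of one at a time.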
Abstract
A lot of recent progress has been made in ultra low-bit quantization, promising significant improvements in latency, memory footprint and energy consumption on edge devices. Quantization methods such as learned step siz