BriefGPT.xyz
May, 2024
Gradient-based Automatic Per-Weight Mixed Precision Quantization for Neural Networks On-Chip
Chang Sun, Thea K. Årrestad, Vladimir Loncar, Jennifer Ngadiuba, Maria Spiropulu
TL;DR
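A per-weight mixed-precision quantization training method that reduces model size and inference latency and improves resource utilization for low-latency, low-power neural networks deployed on FPGAs, while maintaining accuracy.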
Abstract
Model size and inference speed at deployment time are major challenges in many deep learning applications. A promising strategy to overcome these challenges is
→