Reducing the number of bits needed to encode the weights and activations of
neural networks is highly desirable, as it speeds up both training and
inference while reducing memory consumption. For these reasons, research
in this area has attracted significant attention toward devel