The softmax representation of probabilities for categorical variables plays a
prominent role in modern machine learning with numerous applications in areas
such as large scale classification, neural language modeling and recommendation
systems. However, softmax estimation is very expensive for large scale
inference because of the high cost associated with co