minwise hashing (MinHash) is a standard algorithm widely used in the
industry, for large-scale search and learning applications with the binary
(0/1) Jaccard similarity. One common use of MinHash is for processing massive
n-gram text representations so that practitioners do not have to