Databases often contain corrupted, degraded, and noisy data with duplicate
entries across and within each database. Such problems arise in citations,
medical databases, genetics, human rights databases, and a variety of other
applied settings. The target of statistical inference can be viewed as an
unsupervised problem of determining the edges of a bipartite