Mar, 2020
LSF-Join: Locality Sensitive Filtering for Distributed All-Pairs Set Similarity Under Skew
Cyrus Rashtchian, Aneesh Sharma, David P. Woodruff
TL;DR
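Proposes LSF-Join, a randomized algorithm based on locality sensitive filtering that efficiently finds all matching pairs in large datasets in an approximate manner. It is particularly well suited to high-dimensional datasets and addresses the inability of earlier algorithms to scale to large data.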
Abstract
All-pairs set similarity is a widely used data mining task, even for large and high-dimensional datasets. Traditionally, similarity search has focused on discovering very similar pairs, for which a variety of eff