Prabhat Agarwal, Manisha Srivastava, Vishwakarma Singh, Charles Rosenberg
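TL;DR: This paper proposes SEINE (Spam DEtection using Interaction NEtworks), a spam detection model that uses a graph-based framework to learn from large-scale data and accounts for neighborhood and edge attributes, making it suitable for deployment in large-scale production systems. On a real-world dataset, it achieves 80% recall with a 1% false positive rate.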
Abstract
Spam is a serious problem plaguing web-scale digital platforms that facilitate user content creation and distribution. It compromises a platform's integrity, the performance of services such as recommendation and search, and the overall business. Spammers engage in a variety of abusive and evasive behaviors that are distinct from those of non-spammers. Users' complex behavior ca