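TL;DR: Proposes an improved naive Bayes method for text classification that explicitly models the mechanism generating incorrect labels and iteratively optimizes the corresponding log-likelihood function with an EM algorithm, greatly improving the performance of naive Bayes on data with mislabeled examples.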
Abstract
Labeling mistakes are frequently encountered in real-world applications. If
not handled properly, they can seriously degrade the classification
performance of a model. To address this issue, we