He Zhao, Dinh Phung, Viet Huynh, Trung Le, Wray Buntine
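TL;DR: Drawing on optimal transport (OT) theory, this work proposes a new neural topic model that yields better document representations and more coherent and diverse topics. Specifically, it learns each document's topic distribution by minimizing the document's OT distance. Experiments show that the model significantly outperforms existing neural topic models on text analysis for both regular and short texts.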
Abstract
Recently, neural topic models (NTMs) inspired by variational autoencoders
have attracted increasing research interest due to their promising results on
text analysis. However, it is usually hard for existing NTM
EdTM is a label-name-supervised topic modeling approach that incorporates analysts' understanding of the corpus, using LM/LLM-based document-topic affinities and optimal transport to make globally coherent topic assignments.