Semi-Supervised Convolutional Neural Networks for Text Classification via Region Embedding

Apr, 2015

Semi-Supervised Learning with Multi-View Embedding: Theory and Application with Convolutional Neural Networks

Rie Johnson, Tong Zhang

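TL;DR: This paper proposes a new semi-supervised framework for text classification with convolutional neural networks (CNNs). Unlike earlier methods that rely on word embeddings, our method learns embeddings of small text regions from unlabeled data and integrates them into a supervised CNN. Our model achieves better results than previous methods on sentiment classification and topic classification tasks.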
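
To make the idea of learning region embeddings from unlabeled data more concrete, here is a minimal sketch of how two-view training pairs might be built from raw text: each small text region is one view, and the words around it are the other view that a model would be trained to predict. The function name `region_context_pairs`, the window sizes, and the example sentence are illustrative assumptions, not taken from the paper, and the sketch covers only pair construction, not the embedding model or the supervised CNN.

```python
from typing import List, Tuple


def region_context_pairs(
    tokens: List[str],
    region_size: int = 3,   # assumed region width, not from the paper
    context_size: int = 2,  # assumed number of neighbouring words per side
) -> List[Tuple[List[str], List[str]]]:
    """Pair each text region (view 1) with its neighbouring words (view 2)."""
    pairs = []
    for start in range(len(tokens) - region_size + 1):
        end = start + region_size
        region = tokens[start:end]
        # Context is taken from both sides and clipped at the text boundaries.
        left = tokens[max(0, start - context_size):start]
        right = tokens[end:end + context_size]
        pairs.append((region, left + right))
    return pairs


# Hypothetical unlabeled sentence used only for illustration.
tokens = "the movie was not good at all".split()
pairs = region_context_pairs(tokens)
```

Under these assumptions, a model trained to predict the second element of each pair from the first would yield a region embedding that could then be supplied as extra input to a supervised classifier.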

Abstract

This paper presents a theoretical analysis of multi-view embedding -- feature embedding that can be learned from unlabeled data through the task of predicting one view from another. We prove its usefulness in supervised learning under certain conditions. The result explains the effectiveness of some existing methods such as word embedding. Based on this theo