Representation Learning for Clustering: A Statistical Framework

Jun, 2015

Hassan Ashtiani, Shai Ben-David

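TL;DR: This paper proposes a protocol in which the user clusters a relatively small sample of the data, and a data representation for clustering is then learned from that clustered sample. It analyzes the statistical sample complexity of learning such a representation, as well as the VC-dimension of the class of representations induced by linear embeddings, and shows that representation classes with finite VC-dimension can be learned successfully.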

Abstract

We address the problem of communicating domain knowledge from a user to the designer of a clustering algorithm. We propose a protocol in which the user provides a clustering of a relatively small random sample of a data