Authors: Barnabas Poczos, Liang Xiong, Jeff Schneider
DOI:
Keywords:
Abstract: Low-dimensional embedding, manifold learning, clustering, classification, and anomaly detection are among the most important problems in machine learning. Here we consider the setting where each instance of the inputs corresponds to a continuous probability distribution. These distributions are unknown to us, but we are given some i.i.d. samples from them. While most existing learning methods operate on points, i.e., finite-dimensional feature vectors, we study algorithms that operate on groups, i.e., sets of vectors. For this purpose, we propose new nonparametric, consistent estimators for a large family of divergences and describe how to apply them to these machine learning problems. As special cases, our estimators can be used to estimate Renyi, Tsallis, Kullback-Leibler, Hellinger, Bhattacharyya, and L2 divergences, as well as mutual information. We present empirical results on synthetic data, real-world images, and astronomical data sets.
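To make the idea of estimating a divergence directly from two i.i.d. sample sets concrete, here is a minimal sketch of a nonparametric k-nearest-neighbor estimator of the Kullback-Leibler divergence, in the spirit of the nearest-neighbor constructions the abstract refers to. This is an illustrative implementation, not the paper's exact estimator; the function name, the brute-force distance computation, and the default k=3 are my own choices.

```python
import numpy as np

def knn_kl_divergence(x, y, k=3):
    """Estimate KL(P || Q) from samples x ~ P and y ~ Q.

    Uses k-nearest-neighbor distances: for each x_i, compare its k-NN
    distance within x (excluding itself) to its k-NN distance into y.
    Illustrative sketch only; assumes continuous distributions so that
    nearest-neighbor distances are strictly positive.
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n, d = x.shape
    m = y.shape[0]
    # Brute-force pairwise Euclidean distances (fine for small samples).
    dxx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dxx, np.inf)            # exclude self-distances
    dxy = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    rho = np.sort(dxx, axis=1)[:, k - 1]     # k-NN distance within x
    nu = np.sort(dxy, axis=1)[:, k - 1]      # k-NN distance from x into y
    # Log-ratio of neighbor distances plus a sample-size correction term.
    return d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1))
```

Such estimators need only the samples, never a density estimate of P or Q, which is what makes them usable when each "instance" is itself a bag of i.i.d. draws from an unknown distribution.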