Model-based clustering of high-dimensional data streams with online mixture of probabilistic PCA

Authors: Anastasios Bellas, Charles Bouveyron, Marie Cottrell, Jérôme Lacaille

DOI: 10.1007/S11634-013-0133-7

Abstract: Model-based clustering is a popular tool which is renowned for its probabilistic foundations and its flexibility. However, model-based techniques usually perform poorly when dealing with high-dimensional data streams, which are nowadays a frequent type of data. To overcome this limitation of model-based clustering, we propose an online inference algorithm for the mixture of probabilistic PCA model. The proposed algorithm relies on an EM-based procedure and on an incremental version of PCA. Model selection is also considered in the online setting through parallel computing. Numerical experiments on simulated and real data demonstrate the effectiveness of our approach and compare it to state-of-the-art algorithms.

References (45)
Irini Moustaki, David J. Bartholomew, Martin Knott. Latent Variable Models and Factor Analysis: A Unified Approach (2011).
Bruce G. Lindsay. Mixture Models: Theory, Geometry, and Applications. Institute of Mathematical Statistics, American Statistical Association (1995). DOI: 10.1214/CBMS/1462106013
Geoff Hulten, Pedro Domingos. A General Method for Scaling Up Machine Learning Algorithms and its Application to Clustering. International Conference on Machine Learning, pp. 106-113 (2001).
T. Robinson, Y. Hicks, P. M. Hall. A Method to Add Gaussian Mixture Models. Department of Computer Science, University of Bath (2004).
G. J. McLachlan, D. Peel. Finite Mixture Models (2000).
Zoubin Ghahramani, Geoffrey E. Hinton. The EM Algorithm for Mixtures of Factor Analyzers. University of Toronto, Department of Computer Science (1996).
Olivier Cappé, Eric Moulines. On-line Expectation-Maximization Algorithm for Latent Data Models. Journal of the Royal Statistical Society, Series B (Statistical Methodology), vol. 71, pp. 593-613 (2009). DOI: 10.1111/J.1467-9868.2009.00698.X
L. O'Callaghan, N. Mishra, A. Meyerson, S. Guha, R. Motwani. Streaming-Data Algorithms for High-Quality Clustering. International Conference on Data Engineering, pp. 685-694 (2002). DOI: 10.1109/ICDE.2002.994785
Allou Samé, Christophe Ambroise, Gérard Govaert. An Online Classification EM Algorithm Based on the Mixture Model. Statistics and Computing, vol. 17, pp. 209-218 (2007). DOI: 10.1007/S11222-007-9017-Z
Chris Fraley, Adrian E. Raftery. Model-Based Clustering, Discriminant Analysis, and Density Estimation. Journal of the American Statistical Association, vol. 97, pp. 611-631 (2002). DOI: 10.1198/016214502760047131