Automatic Model Selection in Archetype Analysis

Authors: Sandhya Prabhakaran, Sudhir Raman, Julia E. Vogt, Volker Roth

DOI: 10.1007/978-3-642-32717-9_46

Abstract: Archetype analysis involves the identification of representative objects from amongst a set of multivariate data such that the data can be expressed as a convex combination of these representative objects. Existing methods for archetype analysis assume a fixed number of archetypes a priori. Multiple runs of these methods for different choices of the number of archetypes are required for model selection. Not only is this computationally infeasible for larger datasets, but in heavy-noise settings model selection also becomes cumbersome. In this paper, we present a novel extension to these existing methods with a specific focus on relaxing the need to provide a fixed number of archetypes beforehand. Our fast iterative optimization algorithm is devised to automatically select the right model using BIC scores and can easily be scaled to noisy, large datasets. These benefits are achieved by introducing a Group-Lasso component, popular for sparse linear regression. The usefulness of the approach is demonstrated through simulations and on a real-world application of document analysis for identifying topics.
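
The abstract does not spell out the optimization, so the following is only a minimal sketch of the general mechanism by which a Group-Lasso penalty can remove whole candidate archetypes. It is not the authors' algorithm. It assumes each row of a hypothetical coefficient matrix `weights` belongs to one candidate archetype and applies the standard group soft-thresholding step (the proximal operator of a sum of row-wise L2 norms). The function names, the data, and the penalty value `lam` are all illustrative assumptions.

```python
import math


def group_soft_threshold(rows, lam):
    """Shrink each row by lam in L2 norm; rows with norm <= lam become all zeros."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row))
        # A whole group (row) is zeroed at once when its norm does not exceed lam.
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out.append([scale * v for v in row])
    return out


def active_groups(rows, tol=1e-12):
    """Indices of rows that still carry a nonzero coefficient."""
    return [i for i, row in enumerate(rows) if any(abs(v) > tol for v in row)]


# Hypothetical coefficients: one row per candidate archetype (illustrative values).
weights = [[3.0, 4.0], [0.3, 0.4], [0.0, 2.0]]
shrunk = group_soft_threshold(weights, lam=1.0)
selected = active_groups(shrunk)
```

In this toy setting the second row has norm 0.5, which is below `lam`, so it is removed as a group while the other two rows are only shrunk. A model selection loop in this spirit would repeat such a step for several penalty values and compare the resulting models with a criterion such as BIC, as the abstract indicates.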
