Preliminary theoretical results on a feature relevance determination method for Generative Topographic Mapping

Author: Alfredo Vellido Alcacena

DOI:

Keywords:

Abstract: Feature selection (FS) has long been studied in classification and regression problems, following diverse approaches and resulting in a wide variety of methods, usually grouped as either filters or wrappers. In comparison, FS for unsupervised learning has received far less attention. For many real problems concerning unsupervised multivariate data clustering, FS becomes an issue of paramount importance, as results have to meet interpretability and actionability requirements. A FS method for Gaussian mixture models was recently defined in Law et al. (2004). Mixture models are well established as clustering methods, but their multivariate data visualization capabilities are limited. The Generative Topographic Mapping (Bishop et al. 1998a), a constrained mixture of distributions, was originally defined to overcome such limitation. In this brief report we provide the theoretical development of a feature relevance determination method for Generative Topographic Mapping, based on that defined in Law et al. (2004); with this method, the clustering results can be visualized on a low dimensional latent space and interpreted in terms of a reduced subset of selected relevant features. [This document was revised (8/11/2006)]

References (9)
Geoffrey J. McLachlan, Christophe Ambroise, Kim-Anh Do, Analyzing Microarray Gene Expression Data, (2004)
S. Kaski, T. Kohonen, Merja Oja, Bibliography of Self-Organizing Map (SOM) Papers: 1998-2001 Addendum, Neural Computing Surveys, vol. 3, pp. 1-156, (2003)
G. J. McLachlan, D. Peel, Finite Mixture Models, (2000)
Edward I. George, The Variable Selection Problem, Journal of the American Statistical Association, vol. 95, pp. 1304-1308, (2000), 10.1080/01621459.2000.10474336
Jack Sklansky, Mineichi Kudo, Comparison of algorithms that select features for pattern classifiers, Pattern Recognition, vol. 33, pp. 25-41, (2000), 10.1016/S0031-3203(99)00041-2
A. P. Dempster, N. M. Laird, D. B. Rubin, Maximum Likelihood from Incomplete Data Via the EM Algorithm, Journal of the Royal Statistical Society: Series B (Methodological), vol. 39, pp. 1-22, (1977), 10.1111/J.2517-6161.1977.TB01600.X
Pak Chung Wong, Visual data mining, IEEE Computer Graphics and Applications, vol. 19, pp. 20-21, (1999), 10.1109/MCG.1999.788794
D. Peel, G. J. McLachlan, Robust mixture modelling using the t distribution, Statistics and Computing, vol. 10, pp. 339-348, (2000), 10.1023/A:1008981510081
Yuan Qi, Thomas P Minka, Rosalind W Picard, Zoubin Ghahramani, Predictive automatic relevance determination by expectation propagation, Twenty-first international conference on Machine learning - ICML '04, pp. 85-, (2004), 10.1145/1015330.1015418