Visualisation of heterogeneous data with simultaneous feature saliency using Generalised Generative Topographic Mapping

Authors: Ian T. Nabney, Shahzad Mumtaz, Gurjinder Bassi, Michel F. Randrianandrasana

Keywords: Latent variable, Probabilistic logic, Computer science, Data mining, Generative topographic mapping, Pattern recognition, Single type, Artificial intelligence, Parameter learning, Feature saliency, Visualization, Conditional independence

Abstract: Most machine-learning algorithms are designed for datasets with features of a single type, whereas very little attention has been given to datasets with mixed-type features. We recently proposed a model to handle mixed types with a probabilistic latent variable formalism. This model describes the data by type-specific distributions that are conditionally independent given the latent space and is called generalised generative topographic mapping (GGTM). It has often been observed that visualisations of high-dimensional datasets can be poor in the presence of noisy features. In this paper we therefore propose to extend the GGTM to estimate feature saliency values (GGTMFS) as an integrated part of the parameter learning process with an expectation-maximisation (EM) algorithm. The efficacy of the GGTMFS model is demonstrated on both synthetic and real datasets.
