Random projections for assessing gene expression cluster stability

Authors: A. Bertoni, G. Valentini

DOI: 10.1109/IJCNN.2005.1555821

Abstract: Clustering analysis of gene expression data is characterized by very high dimensionality and low cardinality of the data. Two important related topics are the validation of the obtained clusters and the estimate of their number. In this paper we focus on the stability of the clusters. Our approach to the problem is based on random projections obeying the Johnson-Lindenstrauss lemma, by which data may be projected into randomly selected lower-dimensional subspaces while approximately preserving pairwise distances between examples. We experiment with different types of random projections, comparing the empirical and theoretical distortions induced by randomized embeddings between Euclidean metric spaces, and we present cluster-stability measures that may be used to validate and quantitatively assess the reliability of the clusters obtained by a large class of clustering algorithms. Experimental results on synthetic and DNA microarray data show the effectiveness of the proposed approach.
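
The distance-preservation idea in the abstract can be illustrated with a minimal sketch. It assumes a plain Gaussian random projection, which is only one of several projection types and is not necessarily the one used in the paper. The function names (`random_projection`, `distortion_range`), the data sizes, and the seeds are all hypothetical choices made for illustration. The sketch projects synthetic high-dimensional points into a lower-dimensional space and reports how far the ratios of projected to original squared distances stray from 1.

```python
import math
import random


def random_projection(data, target_dim, seed=0):
    """Project each row of `data` to `target_dim` dimensions.

    Assumption: a Gaussian projection matrix with entries drawn from
    N(0, 1/target_dim), so squared distances are preserved in expectation.
    """
    rng = random.Random(seed)
    source_dim = len(data[0])
    scale = 1.0 / math.sqrt(target_dim)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(target_dim)]
              for _ in range(source_dim)]
    return [[sum(row[i] * matrix[i][j] for i in range(source_dim))
             for j in range(target_dim)]
            for row in data]


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def distortion_range(original, projected):
    """Return (min, max) of projected/original squared pairwise distance ratios."""
    ratios = []
    for i in range(len(original)):
        for j in range(i + 1, len(original)):
            ratios.append(squared_distance(projected[i], projected[j])
                          / squared_distance(original[i], original[j]))
    return min(ratios), max(ratios)


# Synthetic stand-in for expression data: few examples, many dimensions.
data_rng = random.Random(1)
points = [[data_rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(20)]

embedded = random_projection(points, target_dim=200)
low, high = distortion_range(points, embedded)
print(f"distortion ratios in [{low:.3f}, {high:.3f}]")
```

Ratios close to 1 mean the embedding roughly preserves pairwise distances. Lowering `target_dim` widens the range, which is the trade-off the Johnson-Lindenstrauss lemma quantifies.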
