Graph-based consensus clustering for class discovery from gene expression data

Authors: Zhiwen Yu, Hau-San Wong, Hongqiang Wang

DOI: 10.1093/BIOINFORMATICS/BTM463

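Abstract: Motivation: Consensus clustering, also known as cluster ensemble, is one of the important techniques for microarray data analysis, and is particularly useful for class discovery from microarray data. Compared with traditional clustering algorithms, consensus clustering approaches have the ability to integrate multiple partitions from different cluster solutions to improve the robustness, stability, scalability and parallelization of the clustering algorithms. By consensus clustering, one can discover the underlying classes of the samples in gene expression data. Results: In addition to exploring a graph-based consensus clustering (GCC) algorithm to estimate the underlying classes of the samples in microarray data, we design a new validation index to determine the number of classes in microarray data. To our knowledge, this is the first time in which GCC is applied to class discovery from microarray data. Given a pre-specified maximum number of classes (denoted as Kmax in this article), our algorithm can discover the true number of classes for the samples in microarray data according to a new cluster validation index called the Modified Rand Index. Experiments on gene expression data indicate that our new algorithm can (i) outperform most of the existing algorithms, (ii) identify the number of classes correctly in real cancer datasets, and (iii) discover the classes of samples with biological meaning. Availability: Matlab source code for the GCC algorithm is available upon request from Zhiwen Yu. Contact: yuzhiwen@cs.cityu.edu.hk, cshswong@cityu.edu.hk. Supplementary information: Supplementary data are available at Bioinformatics online.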
