Clustering Time-Series Gene Expression Data with Unequal Time Intervals

Authors: Luis Rueda, Ataul Bari, Alioune Ngom

DOI: 10.1007/978-3-540-92273-5_6

Keywords: Data mining, Clustering high-dimensional data, Hierarchical clustering, Fuzzy clustering, Correlation clustering, Single-linkage clustering, Computer science, Cluster analysis, CURE data clustering algorithm, Biclustering

Abstract: Clustering gene expression data given in terms of time-series is a challenging problem that imposes its own particular constraints, namely that exchanging two or more time points is not possible, as it would deliver quite different results and also lead to erroneous biological conclusions. We have focused on issues related to clustering gene expression temporal profiles, and devised a novel algorithm for clustering gene temporal expression profile microarray data. The proposed clustering method introduces the concept of profile alignment, which is achieved by minimizing the area between two aligned profiles. The overall pattern of expression in the time-series context is accomplished by applying agglomerative clustering combined with profile alignment, and finding the optimal number of clusters by means of a variant of a clustering index, which can effectively decide upon the optimal number of clusters for a given dataset. The effectiveness of the proposed approach is demonstrated on two well-known datasets, yeast and serum, and corroborated with a set of pre-clustered yeast genes, which show a very high classification accuracy of the proposed method, though it is an unsupervised classification scheme.
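
The abstract does not give formulas, so the following is only a minimal sketch of how profile alignment and agglomerative clustering could fit together, not the authors' implementation. It assumes that profiles are piecewise linear between unequally spaced time points, that alignment is a vertical shift, that "area" means the integral of the squared difference between the two aligned profiles, and that clusters are merged by average linkage. The function names `mean_offset`, `aligned_distance`, and `agglomerative` are hypothetical.

```python
def mean_offset(x, y, t):
    """Vertical shift that minimizes the squared area between x and y.

    For piecewise-linear profiles this is the time-weighted mean of
    (x - y), computed with the trapezoid rule so that unequal time
    intervals are weighted by their length.
    """
    total = 0.0
    for k in range(len(t) - 1):
        h = t[k + 1] - t[k]
        total += h * ((x[k] - y[k]) + (x[k + 1] - y[k + 1])) / 2.0
    return total / (t[-1] - t[0])


def aligned_distance(x, y, t):
    """Integral of the squared difference after the optimal vertical shift."""
    a = mean_offset(x, y, t)
    area = 0.0
    for k in range(len(t) - 1):
        h = t[k + 1] - t[k]
        e0 = x[k] - y[k] - a
        e1 = x[k + 1] - y[k + 1] - a
        # Exact integral of a squared linear segment going from e0 to e1.
        area += h * (e0 * e0 + e0 * e1 + e1 * e1) / 3.0
    return area


def agglomerative(profiles, t, n_clusters):
    """Average-linkage agglomerative clustering on the aligned distance."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = sum(
                    aligned_distance(profiles[i], profiles[j], t)
                    for i in clusters[a]
                    for j in clusters[b]
                ) / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy data (made up for illustration): four profiles on unequal time points.
times = [0.0, 1.0, 3.0, 7.0]
toy_profiles = [
    [0.0, 1.0, 0.0, 2.0],
    [5.0, 6.0, 5.0, 7.0],    # same shape as the first, shifted up by 5
    [2.0, 0.0, 2.0, -1.0],
    [1.0, -1.0, 1.0, -2.0],  # same shape as the third, shifted down by 1
]
toy_clusters = agglomerative(toy_profiles, times, 2)
```

In this sketch the number of clusters is passed in by hand. The abstract instead selects it with a variant of a clustering index, which is not specified there and is therefore not modeled here.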
