Clustering microarray time-series data using expectation maximization and multiple profile alignment

Authors: Numanul Subhani, Luis Rueda, Alioune Ngom, Conrad J. Burden

DOI: 10.1109/BIBMW.2009.5332128

Abstract: A common problem in biology is to partition a set of experimental data into clusters in such a way that the data points within the same cluster are highly similar while data points in different clusters are very different. In this direction, clustering microarray time-series data via pairwise alignment of piece-wise linear profiles has been recently introduced. We propose an EM clustering approach based on a multiple alignment of natural cubic spline representations of gene expression profiles. The multiple alignment is achieved by minimizing the sum of integrated squared errors over a time-interval, defined on a set of profiles. Preliminary experiments on a well-known data set of 221 pre-clustered Saccharomyces cerevisiae gene expression profiles yield encouraging results with 83.26% accuracy.
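The alignment step named in the abstract can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that profiles are aligned by vertical shifts only, that all profiles share the same time points, and that the objective is the sum over profile pairs of the integrated squared difference between shifted splines. Under those assumptions, subtracting each profile's mean spline value over the interval is a minimizer (unique up to a common constant). The function names `natural_spline_second_derivs`, `spline_integral`, and `align_profiles` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def natural_spline_second_derivs(t: Sequence[float], y: Sequence[float]) -> List[float]:
    """Second derivatives M_i of the natural cubic spline through (t_i, y_i)."""
    n = len(t)
    h = [t[i + 1] - t[i] for i in range(n - 1)]
    M = [0.0] * n  # natural boundary conditions: M_0 = M_{n-1} = 0
    m = n - 2      # number of interior unknowns
    if m < 1:
        return M
    # Tridiagonal system for the interior second derivatives.
    sub = [h[i - 1] for i in range(1, n - 1)]
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n - 1)]
    sup = [h[i] for i in range(1, n - 1)]
    rhs = [6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
           for i in range(1, n - 1)]
    # Thomas algorithm: forward elimination, then back substitution.
    for k in range(1, m):
        w = sub[k] / diag[k - 1]
        diag[k] -= w * sup[k - 1]
        rhs[k] -= w * rhs[k - 1]
    x = [0.0] * m
    x[-1] = rhs[-1] / diag[-1]
    for k in range(m - 2, -1, -1):
        x[k] = (rhs[k] - sup[k] * x[k + 1]) / diag[k]
    M[1:n - 1] = x
    return M


def spline_integral(t: Sequence[float], y: Sequence[float]) -> float:
    """Exact integral of the natural cubic spline over [t[0], t[-1]]."""
    M = natural_spline_second_derivs(t, y)
    total = 0.0
    for i in range(len(t) - 1):
        h = t[i + 1] - t[i]
        total += h * (y[i] + y[i + 1]) / 2.0 - h ** 3 * (M[i] + M[i + 1]) / 24.0
    return total


def align_profiles(t: Sequence[float],
                   profiles: Sequence[Sequence[float]]) -> List[List[float]]:
    """Shift each profile vertically so its spline integrates to zero.

    Assumption (not from the source): alignment uses vertical shifts only.
    Shifting the data by a constant shifts the interpolating spline by the
    same constant, so it suffices to shift the sample values.
    """
    T = t[-1] - t[0]
    aligned = []
    for y in profiles:
        offset = spline_integral(t, y) / T  # mean value of the spline on [t0, tT]
        aligned.append([v - offset for v in y])
    return aligned


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0, 3.0]
    a = [1.0, 3.0, 2.0, 5.0]
    b = [v + 5.0 for v in a]  # same shape, shifted upward
    for profile in align_profiles(times, [a, b]):
        print([round(v, 6) for v in profile])
```

In this sketch, two profiles that differ only by a constant offset become identical after alignment, so a clustering step such as EM can then compare profile shapes rather than absolute expression levels.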

References (17)
Luis Rueda, Ataul Bari, Alioune Ngom. Clustering Time-Series Gene Expression Data with Unequal Time Intervals. Transactions on Computational Systems Biology X, vol. 10, pp. 100-123 (2008). DOI: 10.1007/978-3-540-92273-5_6
Laurent Bréhélin. Clustering gene expression series with prior knowledge. Workshop on Algorithms in Bioinformatics, pp. 27-38 (2005). DOI: 10.1007/11557067_3
Numanul Subhani, Alioune Ngom, Luis Rueda, Conrad Burden. Microarray Time-Series Data Clustering via Multiple Alignment of Gene Expression Profiles. Pattern Recognition in Bioinformatics, vol. 5780, pp. 377-390 (2009). DOI: 10.1007/978-3-642-04031-3_33
Ziv Bar-Joseph, Georg K. Gerber, David K. Gifford, Tommi S. Jaakkola, Itamar Simon. Continuous Representations of Time-Series Gene Expression Data. Journal of Computational Biology, vol. 10, pp. 341-356 (2003). DOI: 10.1089/10665270360688057
S. Déjean, P. G. P. Martin, A. Baccini, P. Besse. Clustering Time-Series Gene Expression Data Using Smoothing Spline Derivatives. EURASIP Journal on Bioinformatics and Systems Biology, vol. 2007, article 70561 (2007). DOI: 10.1155/2007/70561
Laurie J. Heyer, Semyon Kruglyak, Shibu Yooseph. Exploring Expression Data: Identification and Analysis of Coexpressed Genes. Genome Research, vol. 9, pp. 1106-1115 (1999). DOI: 10.1101/GR.9.11.1106
M. F. Ramoni, P. Sebastiani, I. S. Kohane. Cluster analysis of gene expression dynamics. Proceedings of the National Academy of Sciences of the United States of America, vol. 99, pp. 9121-9126 (2002). DOI: 10.1073/PNAS.132656399
A. P. Dempster, N. M. Laird, D. B. Rubin. Maximum Likelihood from Incomplete Data Via the EM Algorithm. Journal of the Royal Statistical Society: Series B (Methodological), vol. 39, pp. 1-22 (1977). DOI: 10.1111/J.2517-6161.1977.TB01600.X
C. S. Möller-Levet, F. Klawonn, K.-H. Cho, H. Yin, O. Wolkenhauer. Clustering of unevenly sampled gene expression time-series data. Fuzzy Sets and Systems, vol. 152, pp. 49-66 (2005). DOI: 10.1016/J.FSS.2004.10.014
Saeed Tavazoie, Jason D. Hughes, Michael J. Campbell, Raymond J. Cho, George M. Church. Systematic determination of genetic network architecture. Nature Genetics, vol. 22, pp. 281-285 (1999). DOI: 10.1038/10343