Clustering algorithms: on learning, validation, performance, and applications to genomics.

Authors: Lori Dalton, Virginia Ballarin, Marcel Brun

DOI: 10.2174/138920209789177601

Keywords: Cluster analysis, DNA microarray, Microarray analysis techniques, Data mining, Profiling (information science), Computer science, Image processing, Genomics, SIMPLE algorithm, Gene chip analysis

Abstract: The development of microarray technology has enabled scientists to measure the expression of thousands of genes simultaneously, resulting in a surge of interest in several disciplines throughout biology and medicine. While data clustering has been used for decades in image processing and pattern recognition, in recent years it has joined this wave of activity as a popular technique to analyze microarrays. To illustrate its application to genomics, clustering applied to genes from a set of microarrays groups together those genes whose expression levels exhibit similar behavior throughout the samples, and when applied to samples it offers the potential to discriminate pathologies based on their differential patterns of gene expression. Although clustering has now been used for many years in the context of gene expression microarrays, it has remained highly problematic. The choice of a clustering algorithm and validation index is not a trivial one, more so when applying them to high throughput biological or medical data. Factors to consider when choosing an algorithm include the nature of the application, the characteristics of the objects to be analyzed, the expected number and shape of the clusters, and the complexity of the problem versus the computational power available. In some cases a very simple algorithm may be appropriate to tackle a problem, but many situations may require a more complex and powerful algorithm better suited for the job at hand. In this paper, we will cover the theoretical aspects of clustering, including error and learning, followed by an overview of popular clustering algorithms and classical validation indices. We also discuss the relative performance of these algorithms and indices and conclude with examples of the application of clustering to computational biology.
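As a rough illustration of the workflow the abstract describes (grouping genes whose expression levels behave similarly across samples, then scoring the grouping with a validation index), the sketch below runs a small k-means pass and computes a mean silhouette width. Both choices are stand-ins picked for familiarity; the abstract does not say which algorithms or indices the paper covers. The gene names, expression values, and the `init_indices` parameter are hypothetical and exist only for this example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_means(profiles, init_indices, max_iter=50):
    """Plain k-means; initial centroids are the profiles at init_indices."""
    centroids = [list(profiles[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assign each profile to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: euclidean(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its current members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def mean_silhouette(profiles, labels):
    """Average silhouette width; values near 1 suggest compact, separated clusters."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(profiles):
        mean_dist = {}
        for c in clusters:
            others = [q for j, q in enumerate(profiles) if labels[j] == c and j != i]
            if others:
                mean_dist[c] = sum(euclidean(p, q) for q in others) / len(others)
        a = mean_dist.get(labels[i])  # cohesion: mean distance within own cluster
        rivals = [d for c, d in mean_dist.items() if c != labels[i]]
        if a is None or not rivals:
            scores.append(0.0)  # singleton cluster or only one cluster
            continue
        b = min(rivals)  # separation: nearest other cluster
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: six genes measured in four samples.
genes = {
    "geneA": [1.0, 1.2, 5.0, 5.1],
    "geneB": [0.9, 1.1, 4.8, 5.3],
    "geneC": [1.1, 0.8, 5.2, 4.9],
    "geneD": [5.0, 5.2, 1.0, 0.9],
    "geneE": [4.9, 5.1, 1.2, 1.1],
    "geneF": [5.1, 4.8, 0.8, 1.0],
}
profiles = list(genes.values())
labels = k_means(profiles, init_indices=[0, 3])
score = mean_silhouette(profiles, labels)
for name, lab in zip(genes, labels):
    print(name, "-> cluster", lab)
print("mean silhouette:", round(score, 3))
```

Clustering samples instead of genes would amount to transposing the table so that each profile is one sample measured across genes. On real microarray data the result can depend heavily on the initial centroids, the distance measure, and the number of clusters, which is the kind of sensitivity the abstract flags when it calls the choice of algorithm and index non-trivial.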
