Authors: Pablo A. Jaskowiak, Ricardo J.G.B. Campello, Ivan G. Costa
DOI: 10.1109/TCBB.2013.9
Keywords: Microarray analysis techniques, Mathematics, Data mining, Spearman's rank correlation coefficient, Benchmark (computing), Cluster analysis, Gene expression profiling, Similarity (network science), Euclidean distance, Pearson product-moment correlation coefficient
Abstract: Cluster analysis is usually the first step adopted to unveil information from gene expression microarray data. Besides selecting a clustering algorithm, choosing an appropriate proximity measure (similarity or distance) is of great importance to achieve satisfactory results. Nevertheless, to date, there are no comprehensive guidelines concerning how to choose proximity measures for clustering microarray data. Pearson is the most used proximity measure, whereas the characteristics of other ones remain unexplored. In this paper, we investigate the choice of proximity measures for the clustering of microarray data by evaluating the performance of 16 proximity measures in 52 data sets from time course and cancer experiments. Our results support that measures rarely employed in the gene expression literature can provide better results than commonly employed ones, such as Pearson, Spearman, and Euclidean distance. Given that different measures stood out for time course and cancer data evaluations, their choice should be specific to each scenario. To evaluate measures on time-course data, we preprocessed and compiled 17 data sets from the microarray literature in a benchmark, along with a new methodology called Intrinsic Biological Separation Ability (IBSA). Both can be employed in future research to assess the effectiveness of new measures.