Authors: Pablo A. Jaskowiak, Ricardo J. G. B. Campello, Ivan G. Costa
DOI: 10.1007/978-3-642-31927-3_11
Keywords:
Abstract: Cluster analysis is usually the first step adopted to unveil information from gene expression data. One of its common applications is the clustering of cancer samples, associated with the detection of previously unknown cancer subtypes. Although guidelines have been established concerning the choice of appropriate clustering algorithms, little attention has been given to the subject of proximity measures. Whereas the Pearson correlation coefficient appears as the de facto proximity measure in this scenario, no comprehensive study analyzing other correlation coefficients as alternatives to it has been conducted. Considering such facts, we evaluated five correlation coefficients (along with Euclidean distance) regarding the clustering of cancer samples. Our evaluation was conducted on 35 publicly available datasets, covering both (i) the intrinsic separation ability and (ii) the clustering predictive ability of the correlation coefficients. Our results support that correlation coefficients rarely considered in the gene expression literature may provide results competitive with those of more generally employed ones. Finally, we show that a recently introduced measure arises as a promising alternative to the commonly employed Pearson, providing competitive and even superior results to it.