Coping with new Challenges in Clustering and Biomedical Imaging

Author: Annahita Oswald

DOI:

Keywords:

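Abstract: Recent years have seen a tremendous increase in data acquisition in scientific fields such as molecular biology, bioinformatics, and biomedicine. Novel methods are therefore needed for the automatic processing and analysis of this large amount of data. Data mining is the process of applying methods such as clustering and classification to databases in order to uncover hidden patterns. Clustering is the task of partitioning the points of a data set into distinct groups so as to maximize intra-cluster similarity and minimize inter-cluster similarity. In contrast to unsupervised learning such as clustering, classification is a supervised learning problem that aims at predicting the group membership of data objects on the basis of rules learned from a training set in which the group membership is known. Specialized methods have been proposed for hierarchical clustering. However, these methods suffer from several drawbacks. In the first part of this work, new clustering methods are proposed that cope with the problems of conventional clustering algorithms. ITCH (Information-Theoretic Cluster Hierarchies) is a hierarchical clustering method based on a hierarchical variant of the Minimum Description Length (MDL) principle, which finds hierarchies of clusters without requiring input parameters. As ITCH may converge only to a local optimum, we propose GACH (Genetic Algorithm for Finding Cluster Hierarchies), which combines the benefits of genetic algorithms with information theory. In this way the search space is explored more effectively. Furthermore, we propose INTEGRATE, a clustering method for data with mixed numerical and categorical attributes. Supported by the MDL principle, our method integrates the information provided by heterogeneous numerical and categorical attributes and thus naturally balances the influence of both sources of information. A competitive evaluation illustrates that INTEGRATE is more effective than existing clustering methods for mixed-type data. Besides clustering methods for single data objects, we provide a solution for clustering data sets that are represented by their skylines. The skyline operator is a well-established database primitive for finding the database objects that minimize two or more attributes with an unknown weighting between these attributes. In this thesis, we define a similarity measure, called SkyDist, for comparing the skylines of different data sets; it can be integrated directly into data mining tasks such as clustering and classification. The experiments show that SkyDist, in combination with different clustering algorithms, can give useful insights into many applications.

In the second part, we focus on the analysis of high-resolution magnetic resonance images (MRI) that are clinically relevant and may allow for the early detection and diagnosis of several diseases. In particular, we propose a framework for the classification of Alzheimer's disease in MR images, combining the data mining steps of feature selection, clustering, and classification. As a result, a set of highly selective features discriminating patients with Alzheimer's disease from healthy people has been identified. However, the analysis of high-dimensional MR images is extremely time-consuming. We therefore developed JGrid, a scalable distributed computing solution designed to allow for a large-scale analysis of MRI and thus an optimized prediction of diagnosis. In another study, we apply efficient algorithms for motif discovery to task-fMRI scans in order to identify patterns in the brain that are characteristic of patients with somatoform pain disorder. We find groups of brain compartments that occur frequently within the brain networks and discriminate well between healthy and diseased people.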
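
To illustrate the skyline operator mentioned in the abstract, the following is a minimal Python sketch of the standard textbook definition: a point belongs to the skyline if no other point is at least as good in every attribute and strictly better in at least one (smaller values are treated as better). It is not the thesis's SkyDist measure or any of its algorithms; the function names `dominates` and `skyline` and the hotel data are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every attribute and
    strictly better in at least one (smaller values are better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: Sequence[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical data set: (price, distance to beach), both to be minimized.
hotels = [(50.0, 8.0), (80.0, 2.0), (60.0, 5.0), (90.0, 6.0), (70.0, 5.0)]
result = skyline(hotels)
print(result)
```

Here (90.0, 6.0) and (70.0, 5.0) drop out because (60.0, 5.0) is at least as good in both attributes and cheaper. The quadratic scan is chosen for readability; it makes no claim about how the thesis computes or compares skylines.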
