Design and analysis of clustering algorithms for numerical, categorical and mixed data

Author: Maria Del Mar Suarez Alvarez

Abstract: In recent times, several machine learning techniques have been applied successfully to discover useful knowledge from data. Cluster analysis, which aims at finding similar subgroups in a large heterogeneous collection of records, is one of the most popular data mining techniques available. The purpose of this research is to design and analyse clustering algorithms for numerical, categorical and mixed data sets. Most clustering algorithms are limited to either numerical or categorical attributes. Datasets with mixed types of attributes are common in real life, so designing and analysing clustering algorithms for mixed data sets is quite timely. Determining the optimal solution to the clustering problem is NP-hard. Therefore, it is necessary to find solutions that are regarded as “good enough” quickly. Similarity is a fundamental concept in the definition of a cluster. It is very common to calculate the similarity or dissimilarity between two features using a distance measure. Attributes with large ranges will implicitly be assigned larger contributions to the metrics than attributes with small ranges. There are only a few papers especially devoted to normalisation methods. Usually, data is scaled to unit range. This does not secure equal average contributions of all features to the similarity measure. For that reason, a main part of this thesis is devoted to normalisation.
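The point about normalisation can be illustrated with a small sketch. It is not the thesis's method: the toy data, the function names `min_max_scale` and `mean_squared_difference`, and the choice of squared Euclidean distance are all assumptions made here for illustration. The sketch measures each attribute's average contribution to the squared distance between pairs of records, before and after scaling to unit range.

```python
from itertools import combinations


def min_max_scale(values):
    """Rescale a list of numbers linearly to the unit range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def mean_squared_difference(values):
    """Average of (a - b)^2 over all unordered pairs of distinct records.

    For squared Euclidean distance, this is the attribute's average
    contribution to the distance between two records.
    """
    pairs = list(combinations(values, 2))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)


# Hypothetical toy data: two attributes measured on the same five records.
evenly_spread = [0.0, 25.0, 50.0, 75.0, 100.0]  # large range, even spread
skewed = [1.0, 1.1, 1.2, 1.3, 5.0]              # small range, one outlier

# Average contribution of each attribute on the raw scale.
raw = (mean_squared_difference(evenly_spread), mean_squared_difference(skewed))

# Average contribution of each attribute after scaling to unit range.
scaled = (
    mean_squared_difference(min_max_scale(evenly_spread)),
    mean_squared_difference(min_max_scale(skewed)),
)

print("raw contributions:   ", raw)
print("scaled contributions:", scaled)
```

On this toy data the raw contributions differ by orders of magnitude, because the first attribute has a much larger range. After scaling to unit range both attributes span [0, 1], yet their average contributions are still unequal, since they depend on how the values are distributed within the range and not only on its endpoints.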
