A Method Based on Genetic Programming for Improving the Quality of Datasets in Classification Problems

Authors: Ricardo Aler, César Estébanez, José María Valls

Abstract: The representation of data is a key issue in the Machine Learning (ML) field. ML tries to automatically induce knowledge from a set of examples or instances of a problem, learning how to distinguish between the different classes. It is known that inappropriate representations of the data can drastically limit the performance of ML algorithms. On the other hand, high-quality representations of the same data can produce a strong improvement in classification rates. In this work we present a GP-based method for automatically evolving projections. These projections change the data space of a classification problem into a higher-quality one, thus improving the performance of ML algorithms. At the same time, our approach can reduce the dimensionality of the problem by constructing more relevant attributes. We have tested our approach in four domains. The experiments show that it obtains good results compared to other ML approaches that do not use our projections, while reducing dimensionality in many cases.
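To make the idea of a projection concrete, here is a minimal sketch. It does not reproduce the authors' method. It assumes a toy two-attribute dataset, a hand-written candidate projection in place of one evolved by GP, and a single-threshold classifier whose training accuracy serves as the fitness score. The abstract specifies none of these, and the names `make_circle_data`, `stump_accuracy`, and `fitness` are hypothetical.

```python
import random

def make_circle_data(n=200, seed=0):
    """Toy dataset (assumption): class 1 if the point lies inside the unit circle."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, y = rng.uniform(-2, 2), rng.uniform(-2, 2)
        data.append(((x, y), 1 if x * x + y * y < 1.0 else 0))
    return data

def stump_accuracy(values, labels):
    """Best training accuracy of a one-threshold classifier on a single attribute."""
    best = 0.0
    for t in sorted(set(values)):
        for below_label in (0, 1):
            correct = sum(
                1 for v, c in zip(values, labels)
                if (below_label if v <= t else 1 - below_label) == c
            )
            best = max(best, correct / len(labels))
    return best

def fitness(projection, data):
    """Score a candidate projection by classifying the projected (1-D) data."""
    values = [projection(*attrs) for attrs, _ in data]
    labels = [c for _, c in data]
    return stump_accuracy(values, labels)

data = make_circle_data()

# Original attributes, used one at a time.
raw_x = fitness(lambda x, y: x, data)
raw_y = fitness(lambda x, y: y, data)

# A constructed attribute that a GP expression tree could encode: x*x + y*y.
# It maps two attributes to one, so dimensionality drops from 2 to 1.
constructed = fitness(lambda x, y: x * x + y * y, data)

print(f"x only: {raw_x:.3f}  y only: {raw_y:.3f}  x*x + y*y: {constructed:.3f}")
```

In a GP setting, the lambda passed to `fitness` would be replaced by individuals from an evolving population of expression trees, with a score of this kind guiding selection. The sketch only shows why one well-chosen constructed attribute can be easier to classify than the original attributes.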
