Improved Self-Generating Prototypes Algorithm for Imbalanced Datasets

Authors: D. V. R. Oliveira, G. R. Magalhaes, G. D. C. Cavalcanti, T. I. Ren

DOI: 10.1109/ICTAI.2012.126

Abstract: Some real-world datasets have different proportions of classes, with too many instances of the majority classes and only a few instances of the minority classes; those are called imbalanced datasets. Many applications, like medical diagnosis and risk analysis, are interested in the under-represented class, but classifiers and prototype generation techniques usually have a bias towards the majority classes. Because of that, the problem of classification with imbalanced datasets has become an important topic in Pattern Recognition. The Self-Generating Prototypes (SGP) algorithm has a high reduction power and excellent performance on balanced datasets, but on imbalanced datasets the generated prototypes do not give a good representation of the training dataset. This algorithm generates few, or even none, of the prototypes of the minority class. The aim of this paper is to propose the Adaptive Self-Generating Prototypes (ASGP), an improvement of SGP2, the second version of SGP, designed to handle imbalanced datasets. The paper also exposes the reasons for the low performance of SGP2 on such datasets. Empirical results show that ASGP has a higher performance than SGP2, especially when it comes to the accuracy of the minority class.

References (8)
Janez Demšar, Statistical Comparisons of Classifiers over Multiple Data Sets. Journal of Machine Learning Research, vol. 7, pp. 1-30 (2006)
Alberto Fernández, María José del Jesus, Francisco Herrera, On the 2-tuples based genetic tuning performance for fuzzy rule based classification systems in imbalanced data-sets. Information Sciences, vol. 180, pp. 1268-1291 (2010). DOI: 10.1016/J.INS.2009.12.014
Qiang Yang, Xindong Wu, 10 Challenging Problems in Data Mining Research. International Journal of Information Technology and Decision Making, vol. 05, pp. 597-604 (2006). DOI: 10.1142/S0219622006002258
Salvador García, Joaquín Derrac, Isaac Triguero, Cristóbal J. Carmona, Francisco Herrera, Evolutionary-based selection of generalized instances for imbalanced classification. Knowledge Based Systems, vol. 25, pp. 3-12 (2012). DOI: 10.1016/J.KNOSYS.2011.01.012
Jin Huang, C.X. Ling, Using AUC and accuracy in evaluating learning algorithms. IEEE Transactions on Knowledge and Data Engineering, vol. 17, pp. 299-310 (2005). DOI: 10.1109/TKDE.2005.50
Hatem A. Fayed, Sherif R. Hashem, Amir F. Atiya, Self-generating prototypes for pattern classification. Pattern Recognition, vol. 40, pp. 1498-1509 (2007). DOI: 10.1016/J.PATCOG.2006.10.018
Alberto Fernández, María José del Jesus, Francisco Herrera, Hierarchical fuzzy rule based classification systems with genetic rule selection for imbalanced data-sets. International Journal of Approximate Reasoning, vol. 50, pp. 561-577 (2009). DOI: 10.1016/J.IJAR.2008.11.004
Cristiano de S. Pereira, George D. C. Cavalcanti, Prototype selection: Combining self-generating prototypes and Gaussian mixtures for pattern classification. International Joint Conference on Neural Networks, pp. 3505-3510 (2008). DOI: 10.1109/IJCNN.2008.4634298