CLASSIFICATION OF IMBALANCED DATA: A REVIEW

Authors: YANMIN SUN, ANDREW K. C. WONG, MOHAMED S. KAMEL

DOI: 10.1142/S0218001409007326

Abstract: Classification of data with imbalanced class distribution has encountered a significant drawback of the performance attainable by most standard classifier learning algorithms which assume a relatively balanced class distribution and equal misclassification costs. This paper provides a review of the classification of imbalanced data regarding: the application domains; the nature of the problem; the learning difficulties with standard classifier learning algorithms; the learning objectives and evaluation measures; the reported research …
