B-ROC curves for the assessment of classifiers over imbalanced data sets

Authors: John S. Baras, Alvaro A. Cárdenas

DOI:

Keywords: Receiver operating characteristic, Imbalanced data, Class (biology), Data mining, Machine learning, Key (cryptography), Computer science, Artificial intelligence, Set (abstract data type), Bayesian probability

Abstract: The class imbalance problem appears to be ubiquitous to a large portion of the machine learning and data mining communities. One of the key questions in this setting is how to evaluate learning algorithms in the case of class imbalances. In this paper we introduce Bayesian Receiver Operating Characteristic (B-ROC) curves as a set of tradeoff curves that combine, in an intuitive way, the variables that are more relevant to the evaluation of classifiers over imbalanced data sets. This presentation is based on Section 4 of (Cardenas, Baras, & Seamon 2006).
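To make the evaluation problem concrete, here is a minimal Python sketch of why class imbalance matters. It uses only Bayes' rule to turn a classifier's true positive rate and false positive rate into the probability that a positive prediction is correct, given the prevalence of the positive class. The function names `positive_predictive_value` and `ppv_tradeoff` are hypothetical, and pairing the detection rate with `1 - PPV` is an assumption made for illustration: this record does not state the exact axes of the B-ROC curves.

```python
def positive_predictive_value(tpr, fpr, prevalence):
    """P(actual positive | predicted positive), computed with Bayes' rule.

    tpr: true positive rate, P(predicted positive | actual positive)
    fpr: false positive rate, P(predicted positive | actual negative)
    prevalence: prior probability of the positive class
    """
    true_alarms = prevalence * tpr
    false_alarms = (1.0 - prevalence) * fpr
    total = true_alarms + false_alarms
    if total == 0.0:
        # The classifier never predicts positive, so the value is undefined.
        return float("nan")
    return true_alarms / total


def ppv_tradeoff(roc_points, prevalence):
    """Map (fpr, tpr) ROC points to (1 - PPV, tpr) pairs for one prevalence.

    Assumption: these axes are chosen for illustration only.
    Points where the classifier raises no alarms at all are skipped.
    """
    pairs = []
    for fpr, tpr in roc_points:
        if fpr == 0.0 and tpr == 0.0:
            continue
        ppv = positive_predictive_value(tpr, fpr, prevalence)
        pairs.append((1.0 - ppv, tpr))
    return pairs


if __name__ == "__main__":
    # The same operating point looks very different as the classes become imbalanced.
    for p in (0.5, 0.1, 0.01):
        print(p, round(positive_predictive_value(0.9, 0.1, p), 4))
```

With a true positive rate of 0.9 and a false positive rate of 0.1, the probability that a positive prediction is correct is 0.9 when the classes are balanced, but only about 0.083 when the positive class makes up 1% of the data. A standard ROC curve hides this difference, since the operating point is identical in both cases.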

References (7)
H. Vincent Poor, An Introduction to Signal Detection and Estimation (2nd ed.). Springer-Verlag New York, Inc. (1994)
Stefan Axelsson, The base-rate fallacy and its implications for the difficulty of intrusion detection. Computer and Communications Security, pp. 1-7 (1999), 10.1145/319709.319710
Nitesh V. Chawla, Nathalie Japkowicz, Aleksander Kotcz, Editorial. ACM SIGKDD Explorations Newsletter, vol. 6, pp. 1-6 (2004), 10.1145/1007730.1007733
Foster Provost, Tom Fawcett, Robust Classification for Imprecise Environments. Machine Learning, vol. 42, pp. 203-231 (2001), 10.1023/A:1007601015854
A.A. Cardenas, J.S. Baras, K. Seamon, A framework for the evaluation of intrusion detection systems. IEEE Symposium on Security and Privacy, pp. 63-77 (2006), 10.1109/SP.2006.2
Chris Drummond, Robert C. Holte, Explicitly representing expected cost: an alternative to ROC representation. Knowledge Discovery and Data Mining, pp. 198-207 (2000), 10.1145/347090.347126