The impact of using biased performance metrics on software defect prediction research.

Authors: Martin J. Shepperd, Jingxiu Yao

Keywords: Potential impact, Matthews correlation coefficient, Context (language use), Machine learning, Metric (mathematics), Software bug, Computer science, Artificial intelligence, Prediction algorithms, Publication, Pairwise comparison

Abstract: … Although not a new finding, we show that F1 is problematic when used in a two-class problem domain such as software defect prediction. By enumerating and then plotting all N = 40 …
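
One way to see why F1 can mislead in a two-class setting is that its formula never uses the true-negative count, whereas the Matthews correlation coefficient (MCC) uses all four cells of the confusion matrix. The following is a minimal sketch of that point, not the enumeration described in the abstract. The two confusion matrices and the helper names `f1_score` and `mcc` are hypothetical and chosen only for illustration, and returning 0 for an undefined MCC is an assumed convention.

```python
import math


def f1_score(tp, fp, fn, tn):
    """F1 = 2*TP / (2*TP + FP + FN). The tn argument is accepted but never used."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient, computed from all four cells."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0 when a row or column total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion matrices that differ only in the true negatives.
few_tn = dict(tp=20, fp=10, fn=10, tn=0)
many_tn = dict(tp=20, fp=10, fn=10, tn=1000)

for name, cm in [("few_tn", few_tn), ("many_tn", many_tn)]:
    print(f"{name}: F1={f1_score(**cm):.3f}  MCC={mcc(**cm):.3f}")
```

Under these assumed counts, F1 is about 0.667 for both matrices, while MCC moves from about -0.333 to about 0.657. The classifier that correctly rejects 1000 non-defective modules looks no better than the one that rejects none when judged by F1 alone.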
