An Empirical Comparison of Machine Learning Techniques in Predicting the Bug Severity of Open and Closed Source Projects

Authors: K. K. Chaturvedi, V. B. Singh

DOI: 10.4018/JOSSP.2012040103

Abstract: Bug severity is the degree of impact that a defect has on the development or operation of a component or system, and can be classified into different levels based on their impact on the system. Identification of the severity level can be useful for the bug triager in allocating the bug to the concerned bug fixer. Various researchers have attempted text mining techniques in predicting the severity of bugs, detection of duplicate bug reports, and assignment of bugs to a suitable fixer for its fix. In this paper, an attempt has been made to compare the performance of different machine learning techniques, namely Support Vector Machine (SVM), probability-based Naive Bayes (NB), Decision Tree based J48 (a Java implementation of C4.5), rule-based Repeated Incremental Pruning to Produce Error Reduction (RIPPER), and Random Forests (RF) learners, in predicting the severity level (1 to 5) of a reported bug by analyzing the summary or short description of the bug reports. The bug report data has been taken from NASA's PITS (Projects and Issue Tracking System) datasets as closed source, and from components of the Eclipse, Mozilla & GNOME datasets as open source projects. The analysis has been carried out in the RapidMiner and STATISTICA data mining tools. The authors measured the performance of the different machine learning techniques by considering (i) the value of accuracy and F-Measure for all severity levels and (ii) the number of best cases at different threshold levels of accuracy and F-Measure.
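
The two measures named in the abstract, overall accuracy and F-Measure per severity level, can be illustrated with a minimal sketch. The labels below are made-up toy data, not results from the paper, and the function name per_level_scores is hypothetical. The paper's own analysis was carried out in RapidMiner and STATISTICA, so this Python version only shows how the measures are defined.

```python
from collections import Counter


def per_level_scores(y_true, y_pred, levels=(1, 2, 3, 4, 5)):
    """Return overall accuracy and per-level precision, recall and F-measure.

    Hypothetical helper for illustration; severity levels 1 to 5 follow
    the range mentioned in the abstract.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for actual, predicted in zip(y_true, y_pred):
        if actual == predicted:
            tp[actual] += 1          # correct prediction for this level
        else:
            fp[predicted] += 1       # predicted level was wrong
            fn[actual] += 1          # actual level was missed

    accuracy = sum(tp.values()) / len(y_true) if y_true else 0.0

    scores = {}
    for level in levels:
        predicted_total = tp[level] + fp[level]
        actual_total = tp[level] + fn[level]
        precision = tp[level] / predicted_total if predicted_total else 0.0
        recall = tp[level] / actual_total if actual_total else 0.0
        denom = precision + recall
        # F-measure is the harmonic mean of precision and recall.
        f_measure = 2 * precision * recall / denom if denom else 0.0
        scores[level] = {
            "precision": precision,
            "recall": recall,
            "f_measure": f_measure,
        }
    return accuracy, scores


# Toy severity labels (assumed for illustration only).
y_true = [1, 2, 2, 3, 3, 3, 4, 5]
y_pred = [1, 2, 3, 3, 3, 2, 4, 4]

accuracy, scores = per_level_scores(y_true, y_pred)
print(f"accuracy = {accuracy:.3f}")
for level, s in scores.items():
    print(f"severity {level}: F-measure = {s['f_measure']:.3f}")
```

Reporting the F-Measure separately for each level, rather than accuracy alone, exposes levels with few reports (level 5 in the toy data) that a classifier may never predict correctly.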
