Software Mining Studies: Goals, Approaches, Artifacts, and Replicability

Authors: Sven Amann, Stefanie Beyer, Katja Kevic, Harald Gall

DOI: 10.1007/978-3-319-28406-4_5

Keywords: Change request, Software development, Software quality, Documentation, Software mining, Software repository, Data science, Artifact (software development), Computer science, Software

Abstract: The mining of software archives has enabled new ways of increasing productivity in software development. Analyzing software quality, mining project evolution, investigating change patterns and evolution trends, mining models of development processes, developing methods for integrating mined data from various historical sources, and analyzing natural language artifacts in software repositories are examples of research topics. Software repositories include various data, ranging from source control systems and issue tracking systems to artifact repositories, such as requirements, design, and architectural documentation, and to archived communication between project members. Practitioners and researchers have recognized the potential of mining these sources to support the maintenance of software, to improve its architecture, and to empirically validate techniques and processes. We revisited software mining studies that were published in recent years in the top venues of software engineering, such as ICSE, ESEC/FSE, and MSR. In analyzing these studies, we highlight different viewpoints: pursued goals, state-of-the-art approaches, mined artifacts, and study replicability. To analyze the mined artifacts, we (lexically) analyzed papers from more than a decade. In terms of replicability, we looked at existing work in the field on mining approaches, tools, and platforms. We address issues of replicability and reproducibility to shed light onto the challenges for large-scale mining studies that would enable stronger conclusion stability.
