Evaluating the evaluations of code recommender systems: a reality check

Authors: Sebastian Proksch, Sven Amann, Sarah Nadi, Mira Mezini

Keywords: Recommender system, Code (cryptography), Data mining, Reality check, Quality (business), Context (language use), Computer science, Software, Information retrieval, Empirical research

Abstract: While researchers develop many new exciting code recommender systems, such as method-call completion, code-snippet completion, or code search, an accurate evaluation of such systems is always a challenge. We analyzed the current literature and found that most of the current evaluations rely on artificial queries extracted from released code, which begs the question: Do such evaluations reflect real-life usages? To answer this question, we capture 6,189 fine-grained development histories from real IDE interactions. We use them as a ground truth and extract 7,157 real queries for a specific method-call recommender system. We compare the results of such real queries with different artificial evaluation strategies and check several assumptions that are repeatedly used in research, but never empirically evaluated. We find that an evolving context that is often observed in practice has a major effect on the prediction quality of recommender systems, but is not commonly reflected in artificial evaluations.


