Authors: Sebastian Proksch, Sven Amann, Sarah Nadi, Mira Mezini
Keywords: Recommender system, Code (cryptography), Data mining, Reality check, Quality (business), Context (language use), Computer science, Software, Information retrieval, Empirical research
Abstract: While researchers develop many new exciting code recommender systems, such as method-call completion, code-snippet completion, or code search, an accurate evaluation of such systems is always a challenge. We analyzed the current literature and found that most evaluations rely on artificial queries extracted from released code, which begs the question: Do such evaluations reflect real-life usages? To answer this question, we capture 6,189 fine-grained development histories from real IDE interactions. We use them as a ground truth and extract 7,157 real queries for a specific method-call recommender system. We compare the results of such real queries with different artificial evaluation strategies and check several assumptions that are repeatedly used in research, but never empirically evaluated. We find that an evolving context that is often observed in practice has a major effect on the prediction quality of recommender systems, but is not commonly reflected in artificial evaluations.