Will They Like This? Evaluating Code Contributions with Language Models

Authors: Alberto Bacchelli, Premkumar T. Devanbu, Vincent J. Hellendoorn

DOI: 10.5555/2820518.2820539

Abstract: Popular open-source software projects receive and review contributions from a diverse array of developers, many of whom have little to no prior involvement with the project. A recent survey reported that reviewers consider conformance to the project's code style to be one of the top priorities when evaluating code contributions on GitHub. We propose to quantitatively evaluate the existence and effects of this phenomenon. To this aim we use language models, which were shown to accurately capture stylistic aspects of code. We find that rejected change sets do contain code significantly less similar to the project than accepted ones; furthermore, the less similar change sets are more likely to be subject to thorough review. Armed with these results, we further investigate whether new contributors learn to conform to the project style, and find that experience is positively correlated with conformance to the project's code style.
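
To make the idea of "similarity to the project" concrete, here is a minimal sketch of how a language model can score a change set against a project's existing code. It is an illustration only, not the authors' implementation. The crude tokenizer, the bigram order, the add-one smoothing, and every name (`tokenize`, `BigramModel`, `cross_entropy`, the sample snippets) are assumptions made for this example. Lower cross-entropy means the snippet looks more like the code the model was trained on.

```python
import math
import re
from collections import Counter


def tokenize(code):
    # Crude lexer (an assumption): identifiers and integers are single tokens,
    # every other non-whitespace character is its own token.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


class BigramModel:
    """Bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, corpus_tokens):
        padded = ["<s>"] + list(corpus_tokens)
        # How often each token appears as the left-hand context of a bigram.
        self.context = Counter(padded[:-1])
        self.bigrams = Counter(zip(padded, padded[1:]))
        # One extra slot stands in for tokens never seen in the corpus.
        self.vocab_size = len(set(padded)) + 1

    def prob(self, prev, tok):
        # Smoothed estimate of P(tok | prev).
        return (self.bigrams[(prev, tok)] + 1) / (self.context[prev] + self.vocab_size)

    def cross_entropy(self, tokens):
        # Average negative log2-probability per token, in bits.
        tokens = list(tokens)
        if not tokens:
            raise ValueError("cannot score an empty token sequence")
        padded = ["<s>"] + tokens
        log_sum = sum(math.log2(self.prob(p, t)) for p, t in zip(padded, padded[1:]))
        return -log_sum / len(tokens)


# Hypothetical project code and two hypothetical contributions.
project = (
    "for (int i = 0; i < n; i++) { total += values[i]; }\n" * 3
    + "if (total > limit) { return limit; }\n"
)
conforming = "for (int j = 0; j < n; j++) { total += values[j]; }"
divergent = "foreach my $v (@values) { $total = $total + $v }"

model = BigramModel(tokenize(project))
score_conforming = model.cross_entropy(tokenize(conforming))
score_divergent = model.cross_entropy(tokenize(divergent))
print(f"conforming: {score_conforming:.2f} bits/token")
print(f"divergent:  {score_divergent:.2f} bits/token")
```

In this toy setup the contribution that mirrors the project's idioms scores lower than the one written in a different style. A realistic study would need a proper lexer, higher-order n-grams, and a more careful smoothing scheme.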
