Mapping Language to Code in Programmatic Context

Authors: Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Luke Zettlemoyer

DOI: 10.18653/V1/D18-1192

Keywords: Computer science, sort, Source code, Code (cryptography), Class (computer programming), Context (computing), Documentation, Programming language, Task (project management), Member variable

Abstract: Source code is rarely written in isolation. It depends significantly on the programmatic context, such as the class that the code would reside in. To study this phenomenon, we introduce the task of generating class member functions given English documentation and the programmatic context provided by the rest of the class. This task is challenging because the desired code can vary greatly depending on the functionality the class provides (e.g., a sort function may or may not be available when we are asked to “return the smallest element” in a particular member variable list). We introduce CONCODE, a new large dataset with over 100,000 examples consisting of Java classes from online code repositories, and develop a new encoder-decoder architecture that models the interaction between the method documentation and the class environment. We also present a detailed error analysis suggesting that there is significant room for future work on this task.
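
The “return the smallest element” case mentioned in the abstract can be made concrete with a minimal sketch. The two Java classes below are hypothetical illustrations and are not taken from the CONCODE dataset. The class names, the `elements` field, and the `getSmallest` method are all assumptions chosen for this example. Both methods answer the same English documentation, but the plausible target code differs because only the first class environment offers a `sort()` member function. Both versions assume a non-empty list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class environment A: a sorting helper is already available.
class SortedContainer {
    private final List<Integer> elements = new ArrayList<>();

    SortedContainer(List<Integer> initial) {
        elements.addAll(initial);
    }

    // Existing member function that generated code could reuse.
    void sort() {
        Collections.sort(elements);
    }

    // Documentation: "return the smallest element"
    // With sort() in scope, one plausible target sorts the list in place
    // and returns the first item (note the side effect on element order).
    int getSmallest() {
        sort();
        return elements.get(0);
    }
}

// Hypothetical class environment B: same documentation, no sort helper.
class PlainContainer {
    private final List<Integer> elements = new ArrayList<>();

    PlainContainer(List<Integer> initial) {
        elements.addAll(initial);
    }

    // Documentation: "return the smallest element"
    // Without a sort helper, a plausible target scans the list instead.
    int getSmallest() {
        int smallest = elements.get(0);
        for (int value : elements) {
            if (value < smallest) {
                smallest = value;
            }
        }
        return smallest;
    }
}
```

A model that sees only the English documentation cannot tell these two targets apart. The member variables and methods of the surrounding class are what disambiguate them, which is the sense in which the task depends on programmatic context.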
