RefiNym: using names to refine types

Authors: Santanu Kumar Dash, Miltiadis Allamanis, Earl T. Barr

DOI: 10.1145/3236024.3236042

Keywords: Code (cryptography), String (computer science), Source code, Variable (computer science), Control flow graph, Identifier, Scope (computer science), Information retrieval, Natural language, Computer science

Abstract: Source code is bimodal: it combines a formal, algorithmic channel and a natural language channel of identifiers and comments. In this work, we model the bimodality of code with name flows, an assignment flow graph augmented to track identifier names. Conceptual types are logically distinct types that do not always coincide with program types. Passwords and URLs are example conceptual types that can share the program type string. Our tool, RefiNym, is an unsupervised method that mines a lattice of conceptual types from name flows and reifies them into distinct nominal types. For string, RefiNym finds and splits conceptual types originally merged into a single type, reducing the number of same-type variables per scope from 8.7 to 2.2 while eliminating 21.9% of scopes that have more than one same-type variable in scope. This makes the code more self-documenting and frees the type system to prevent a developer from inadvertently assigning data across conceptual types.
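
The idea of reifying conceptual types into nominal types can be illustrated with a minimal sketch. It is not RefiNym's output or algorithm: the class names `Password` and `Url`, the function `open_connection`, and the helper `swapped_call_is_rejected` are hypothetical, chosen only to show how two values that share the program type string become distinct once each conceptual type has its own nominal wrapper. The runtime `isinstance` check stands in for what a static type checker would reject.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Password:
    """Hypothetical nominal type for the 'password' conceptual type."""
    value: str


@dataclass(frozen=True)
class Url:
    """Hypothetical nominal type for the 'URL' conceptual type."""
    value: str


def open_connection(url: Url, password: Password) -> str:
    # With plain strings, swapping the two arguments would go unnoticed.
    # With nominal wrappers, the mix-up is detectable; this runtime check
    # stands in for the error a static type checker would report.
    if not isinstance(url, Url) or not isinstance(password, Password):
        raise TypeError("argument has the wrong conceptual type")
    return f"connecting to {url.value}"


def swapped_call_is_rejected() -> bool:
    """Return True if passing a Password where a Url is expected fails."""
    url = Url("https://example.org")
    password = Password("s3cret")
    try:
        open_connection(password, url)  # arguments deliberately swapped
    except TypeError:
        return True
    return False
```

Before the split, both variables would have type `str` and compete as same-type candidates in every scope where they appear; after it, each scope holds at most one `Url` and one `Password`, which is the kind of reduction the abstract reports for string.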
