Authors: Santanu Kumar Dash, Miltiadis Allamanis, Earl T. Barr
Keywords: Code (cryptography), String (computer science), Source code, Variable (computer science), Control flow graph, Identifier, Scope (computer science), Information retrieval, Natural language, Computer science
Abstract: Source code is bimodal: it combines a formal, algorithmic channel and a natural language channel of identifiers and comments. In this work, we model the bimodality of code with name flows, an assignment flow graph augmented to track identifier names. Conceptual types are logically distinct types that do not always coincide with program types. Passwords and URLs are example conceptual types that can share the program type string. Our tool, RefiNym, is an unsupervised method that mines a lattice of conceptual types from name flows and reifies them into distinct nominal types. For string, RefiNym finds and splits conceptual types originally merged into a single type, reducing the number of same-type variables per scope from 8.7 to 2.2 while eliminating 21.9% of scopes that have more than one same-type variable in scope. This makes the code more self-documenting and frees the type system to prevent a developer from inadvertently assigning data across conceptual types.
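
To make the notion of a conceptual type concrete, here is a minimal Python sketch of two conceptual types that would otherwise share the program type string, reified by hand as distinct nominal types. It is an illustration only, not RefiNym's output or implementation: the names `Password`, `Url`, and `open_connection` are hypothetical, and the runtime `isinstance` checks stand in for what a static type checker would reject.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Password:
    """Hypothetical nominal type for the conceptual type 'password'."""
    value: str


@dataclass(frozen=True)
class Url:
    """Hypothetical nominal type for the conceptual type 'URL'."""
    value: str


def open_connection(url: Url, password: Password) -> str:
    """Accept only correctly typed arguments (hypothetical example function)."""
    # With plain str parameters, swapping the two arguments would go unnoticed.
    # Distinct nominal types make the mix-up detectable.
    if not isinstance(url, Url):
        raise TypeError("expected Url")
    if not isinstance(password, Password):
        raise TypeError("expected Password")
    return f"connecting to {url.value}"
```

Calling `open_connection(Url("https://example.com"), Password("s3cret"))` succeeds, while passing the arguments in the wrong order raises `TypeError`, which is the kind of inadvertent cross-type assignment the abstract describes.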