作者: Michael Van Emmerik
DOI:
关键词: Indirect branch 、 Source code 、 Machine code 、 Static single assignment form 、 Executable 、 Computer science 、 Alias analysis 、 Algorithm 、 Programming language 、 Decompiler 、 Switch statement
摘要: Static Single Assignment enables the efficient implementation of many important decompiler components, including expression propagation, preservation analysis, type and analysis indirect jumps calls. Source code is an essential part all software development. It so valuable that when it not available, can be worthwhile deriving from executable form computer programs through process decompilation. There are applications for decompiled source code, inspections malware, bugs, vulnerabilities; interoperability; maintenance application has some or missing. Existing machine decompilers, in contrast to existing decompilers Java similar platforms, have significant deficiencies. These include poor recovery parameters returns, handling calls, nonexistent analysis. shown use (SSA form) these deficiencies overcome. SSA assists with: data How particularly propagation; identification without assuming ABI compliance; (whether a location preserved across call), which needed analysing return locations; implemented as sparse flow problem; Expression propagation key element decompiler, since allows long sequences individual instruction semantics combined into more complex, high level statements. Parameters, types features languages do appear explicitly programs, hence their readability ability recompile generated code. In addition, either absent limited relatively simple library function The calls finding program, translation program elements such switch statements, assigned gotos, virtual pointers. Because challenges, most interesting case. weak at identifying where passed registers, calling convention non standard. A general returns demonstrated, using new devices Collectors. analyses become complex presence recursion. elimination redundant global analyses, implying procedures finalised until other analysed. Full discussed, instructions, well information contribute solution. sparse, iterative, based approach compared with common constraint approach. former requires special functions handle multiple constraints result overloaded operators addition subtraction. Special problems arise aggregate (arrays structures), address taking variables. Indirect branch instructions often handled decode time. Delaying represented powerful techniques used. This results simpler, cost having throwaway restart analyses. this technique easily extends Fortran effectively analysed call potential enabling object oriented Many presented thesis been verified Boomerang open decompiler. goal extending state art decompilation achieved. course still areas left future work. promising research identified range alias