Authors: Stefan Brunthaler, Michael Franz, Per Larsen, Stephen Crane, Mathias Payer
DOI:
Keywords: Source code, Computer science, Code (cryptography), Theoretical computer science, Malware, Set (abstract data type), Data mining, Similarity (network science), Software
Abstract: Similarity metrics, e.g., signatures as used by anti-virus products, are the dominant technique to detect if a given binary is malware. The underlying assumption of this approach is that all instances of a malware (or even of a malware family) will be similar to each other. Software diversification is a probabilistic technique that uses code and data randomization and the expressiveness of the target instruction set to generate large amounts of functionally equivalent but different binaries. Malware diversity builds on software diversity and ensures that any two diversified instances of the same malware have low similarity (according to a set of similarity metrics). An LLVM-based prototype implementation diversifies both the code and the data of binaries, and our evaluation shows that signatures based on similarity match only one or a few instances in a pool of diversified binaries generated from the same source code.
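
To make the notion of "low similarity between functionally equivalent binaries" concrete, here is a minimal sketch in Python. It is not the authors' LLVM-based prototype or one of their metrics. It assumes one simple metric (Jaccard similarity over byte 4-grams) and a toy transformation that inserts the x86 NOP byte `0x90` at random positions in a harmless stand-in byte string. The names `ngrams`, `jaccard`, and `insert_padding` are hypothetical and chosen only for illustration.

```python
import random


def ngrams(data: bytes, n: int = 4) -> set:
    """Return the set of all n-byte windows that occur in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard(a: bytes, b: bytes, n: int = 4) -> float:
    """Jaccard similarity of the n-gram sets of two byte strings."""
    sa, sb = ngrams(a, n), ngrams(b, n)
    if not sa and not sb:
        return 1.0  # two inputs shorter than n are treated as identical
    return len(sa & sb) / len(sa | sb)


def insert_padding(code: bytes, rate: float, rng: random.Random) -> bytes:
    """Toy stand-in for diversification (an assumption, not the paper's method):
    after each byte, insert a 0x90 padding byte with probability `rate`."""
    out = bytearray()
    for byte in code:
        out.append(byte)
        if rng.random() < rate:
            out.append(0x90)
    return bytes(out)


# Harmless stand-in for a code section: 64 distinct bytes, none equal to 0x90.
original = bytes(range(1, 65))

# Two variants built from the same input with different random seeds.
variant_a = insert_padding(original, 0.5, random.Random(1))
variant_b = insert_padding(original, 0.5, random.Random(2))

print("original vs original :", jaccard(original, original))
print("original vs variant_a:", jaccard(original, variant_a))
print("variant_a vs variant_b:", jaccard(variant_a, variant_b))
```

Stripping the padding bytes from either variant recovers the original input, which mirrors the "functionally equivalent but different" property at toy scale. A real diversifier works at the instruction level inside the compiler, and the n-gram length and padding rate here are arbitrary choices.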