Author: Fabian Yamaguchi
DOI:
Keywords: Computer science, Static program analysis, Secure coding, Source code, Code (cryptography), Data flow diagram, Data mining, Control flow, Unsupervised learning, Code review
Abstract: With our increasing reliance on the correct functioning of computer systems, identifying and eliminating vulnerabilities in program code is gaining in importance. To date, the vast majority of these flaws are found by tedious manual auditing of code conducted by experienced security analysts. Unfortunately, a single missed flaw can suffice for an attacker to fully compromise a system, and thus, the sheer amount of code plays into the attacker’s hands. On the defender’s side, this creates a persistent demand for methods that assist in the discovery of vulnerabilities at scale. This thesis introduces pattern-based vulnerability discovery, a novel approach for identifying vulnerabilities which combines techniques from static analysis, machine learning, and graph mining to augment the analyst’s abilities rather than trying to replace her. The main idea of this approach is to leverage patterns in the code to narrow in on potential vulnerabilities, where these patterns may be formulated manually, derived from the security history, or inferred from the code directly. We base our approach on a novel architecture for robust analysis of source code that enables large amounts of code to be mined for vulnerabilities via traversals in a code property graph, a joint representation of a program’s syntax, control flow, and data flow. While useful in its own right to identify occurrences of manually defined patterns, we proceed to show that the platform offers a rich data source for automatically discovering and exposing patterns in code. To this end, we develop different vectorial representations of source code based on symbols, trees, and graphs, allowing it to be processed with machine learning algorithms. Ultimately, this allows us to devise three unique pattern-based techniques for vulnerability discovery, each of which addresses a different task encountered in day-to-day auditing by exploiting a different capability of unsupervised learning methods. In particular, we present a method to identify vulnerabilities similar to a known vulnerability, a method to uncover missing checks linked to security-critical objects, and finally, a method that closes the loop by automatically generating traversals for our code analysis platform to explicitly express and store vulnerable programming patterns. We empirically evaluate our methods on the source code of popular and widely used open source projects, both in controlled settings and in real-world code audits. In controlled settings, we find that all methods considerably reduce the amount of code that needs to be inspected. In real-world audits, our methods allow us to expose many previously unknown and often critical vulnerabilities, including vulnerabilities in the VLC media player, the instant messenger Pidgin, and the Linux kernel.