Pattern-Based Vulnerability Discovery

Author: Fabian Yamaguchi

DOI:

Keywords: Computer science, Static program analysis, Secure coding, Source code, Code (cryptography), Data flow diagram, Data mining, Control flow, Unsupervised learning, Code review

Abstract: With our increasing reliance on the correct functioning of computer systems, identifying and eliminating vulnerabilities in program code is gaining in importance. To date, the vast majority of these flaws are found by tedious manual auditing of code conducted by experienced security analysts. Unfortunately, a single missed flaw can suffice for an attacker to fully compromise a system, and thus, the sheer amount of code plays into the attacker’s cards. On the defender’s side, this creates a persistent demand for methods that assist in the discovery of vulnerabilities at scale. This thesis introduces pattern-based vulnerability discovery, a novel approach for identifying vulnerabilities which combines techniques from static analysis, machine learning, and graph mining to augment the analyst’s abilities rather than trying to replace her. The main idea of this approach is to leverage patterns in the code to narrow in on potential vulnerabilities, where these patterns may be formulated manually, derived from the security history, or inferred from the code directly. We base our approach on a novel architecture for robust analysis of source code that enables large amounts of code to be mined for vulnerabilities via traversals in a code property graph, a joint representation of a program’s syntax, control flow, and data flow. While useful to identify occurrences of manually defined patterns in its own right, we proceed to show that the platform offers a rich data source for automatically discovering and exposing patterns in code. To this end, we develop different vectorial representations of source code based on symbols, trees, and graphs, allowing it to be processed with machine learning algorithms. Ultimately, this enables us to devise three unique pattern-based techniques for vulnerability discovery, each of which addresses a different task encountered in day-to-day auditing by exploiting the capabilities of unsupervised learning methods. In particular, we present a method to identify vulnerabilities similar to a known vulnerability, a method to uncover missing checks linked to security-critical objects, and finally, a method that closes the loop by automatically generating traversals for our code analysis platform to explicitly express and store vulnerable programming patterns. We empirically evaluate our methods on the source code of popular and widely-used open source projects, both in controlled settings and in real world code audits. In controlled settings, we find that all methods considerably reduce the amount of code that needs to be inspected. In real world audits, our methods allow us to expose many previously unknown and often critical vulnerabilities, including vulnerabilities in the VLC media player, the instant messenger Pidgin, and the Linux kernel.
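
Illustrative sketch (not from the thesis): the abstract describes mining code through traversals over a graph that jointly represents syntax, control flow, and data flow. The toy Python below shows the general idea under loud assumptions. The `PropertyGraph` class, the edge labels `AST`, `CFG`, and `DFG`, the function `tainted_sinks`, and the hand-built three-statement example are all hypothetical and are not the author's platform or its query language. A real system would derive the graph from a parser instead of building it by hand.

```python
from collections import defaultdict


class PropertyGraph:
    """Toy property graph: nodes carry key/value properties, edges carry a label."""

    def __init__(self):
        self.nodes = {}                      # node id -> property dict
        self.out_edges = defaultdict(list)   # node id -> [(label, target id)]

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, src, label, dst):
        self.out_edges[src].append((label, dst))

    def out(self, node_id, label):
        """Targets of outgoing edges from node_id that carry the given label."""
        return [dst for (lbl, dst) in self.out_edges[node_id] if lbl == label]


def reachable(graph, start, label):
    """All nodes reachable from start by following edges with one label."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.out(node, label):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def statements_calling(graph, callee):
    """Statement nodes that have a Call child (via AST edges) named callee."""
    result = []
    for node_id, props in graph.nodes.items():
        if props.get("type") != "Statement":
            continue
        for child in graph.out(node_id, "AST"):
            child_props = graph.nodes[child]
            if child_props.get("type") == "Call" and child_props.get("name") == callee:
                result.append(node_id)
                break
    return result


def tainted_sinks(graph, source, sink):
    """Statements calling `sink` that are data-dependent on a call to `source`."""
    sinks = set(statements_calling(graph, sink))
    hits = set()
    for stmt in statements_calling(graph, source):
        hits |= reachable(graph, stmt, "DFG") & sinks
    return sorted(hits)


def build_example():
    """Hand-built graph for three statements of a made-up C function."""
    g = PropertyGraph()
    g.add_node(1, type="Statement", code="n = read_len(fd)")
    g.add_node(2, type="Call", name="read_len")
    g.add_node(3, type="Statement", code="memcpy(buf, src, n)")
    g.add_node(4, type="Call", name="memcpy")
    g.add_node(5, type="Statement", code="memcpy(hdr, src, 8)")
    g.add_node(6, type="Call", name="memcpy")
    # Syntax: each statement contains one call.
    g.add_edge(1, "AST", 2)
    g.add_edge(3, "AST", 4)
    g.add_edge(5, "AST", 6)
    # Control flow: statements execute in order (unused by this traversal).
    g.add_edge(1, "CFG", 3)
    g.add_edge(3, "CFG", 5)
    # Data flow: the value of n defined in statement 1 is used in statement 3.
    g.add_edge(1, "DFG", 3)
    return g


if __name__ == "__main__":
    graph = build_example()
    for stmt in tainted_sinks(graph, "read_len", "memcpy"):
        print(graph.nodes[stmt]["code"])
```

The traversal combines two edge types: syntax edges locate the calls of interest, and data-flow edges connect the source to the sink. Only the first `memcpy` statement is reported, because the second copies a constant length and has no data dependence on `read_len`.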
