Adaptive Grey-Box Fuzz-Testing with Thompson Sampling

Authors: Siddharth Karamcheti, Gideon Mann, David Rosenberg

DOI: 10.1145/3270101.3270108

Abstract: Fuzz testing, or "fuzzing," refers to a widely deployed class of techniques for testing programs by generating a set of inputs for the express purpose of finding bugs and identifying security flaws. Grey-box fuzzing, the most popular fuzzing strategy, combines light program instrumentation with a data-driven process to generate new program inputs. In this work, we present a machine learning approach that builds on AFL, the preeminent grey-box fuzzer, by adaptively learning a probability distribution over its mutation operators on a program-specific basis. These operators, which are selected uniformly at random in AFL and in mutational fuzzers in general, dictate how new inputs are generated, a core part of the fuzzer's efficacy. Our main contributions are two-fold: First, we show that a sampling distribution over mutation operators estimated from training programs can significantly improve the performance of AFL. Second, we introduce a Thompson Sampling, bandit-based optimization approach that fine-tunes the mutator distribution adaptively, during the course of fuzzing an individual program, and that outperforms offline training. A set of experiments across complex programs demonstrates that tuning the mutation operator distribution generates sets of inputs that yield significantly higher code coverage and finds more crashes faster and more reliably than both baseline versions of AFL as well as other AFL-based learning approaches.
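The adaptive step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the record does not state the reward signal, the operator set, or the update rule. The sketch assumes a Beta-Bernoulli bandit in which each mutation operator is an arm and the reward is 1 when a mutated input reaches new coverage. The operator names, the `ThompsonMutatorSelector` class, and the `simulate` helper are all hypothetical.

```python
import random

# Hypothetical operator names for illustration; not AFL's actual operator list.
OPERATORS = ["bit_flip", "byte_flip", "arith", "havoc_insert", "splice"]


class ThompsonMutatorSelector:
    """Beta-Bernoulli Thompson Sampling over a fixed set of mutation operators."""

    def __init__(self, operators, rng=None):
        self.operators = list(operators)
        self.rng = rng or random.Random()
        # Beta(1, 1) prior per operator: a uniform prior, i.e. no initial preference.
        self.successes = {op: 1.0 for op in self.operators}
        self.failures = {op: 1.0 for op in self.operators}

    def choose(self):
        # Sample one plausible success rate per operator from its posterior,
        # then pick the operator whose sampled rate is largest.
        draws = {
            op: self.rng.betavariate(self.successes[op], self.failures[op])
            for op in self.operators
        }
        return max(draws, key=draws.get)

    def update(self, op, found_new_coverage):
        # Assumed reward: 1 if the mutated input reached new coverage, else 0.
        if found_new_coverage:
            self.successes[op] += 1.0
        else:
            self.failures[op] += 1.0

    def posterior_means(self):
        return {
            op: self.successes[op] / (self.successes[op] + self.failures[op])
            for op in self.operators
        }


def simulate(true_rates, steps=5000, seed=0):
    """Run the selector against made-up per-operator success rates."""
    rng = random.Random(seed)
    selector = ThompsonMutatorSelector(true_rates.keys(), rng=rng)
    counts = {op: 0 for op in true_rates}
    for _ in range(steps):
        op = selector.choose()
        counts[op] += 1
        # Stand-in for running the target program on the mutated input.
        selector.update(op, rng.random() < true_rates[op])
    return selector, counts


if __name__ == "__main__":
    # Made-up rates, chosen only to make one operator clearly better.
    rates = dict(zip(OPERATORS, [0.02, 0.02, 0.05, 0.30, 0.05]))
    selector, counts = simulate(rates)
    print(counts)
    print(selector.posterior_means())
```

Under these assumptions the selector starts out choosing operators roughly uniformly, as AFL does, and shifts its choices toward whichever operator has been paying off for the program at hand. In a real fuzzer the simulated coin flip would be replaced by executing the target and checking the coverage map.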

References (27)
Christopher J. C. H. Watkins, Peter Dayan, Technical Note: Q-Learning. Machine Learning, vol. 8, pp. 279-292 (1992), 10.1007/BF00992698
Kin-Keung Ma, Khoo Yit Phang, Jeffrey S. Foster, Michael Hicks, Directed symbolic execution. Static Analysis Symposium, pp. 95-111 (2011), 10.1007/978-3-642-23702-7_11
David A. Molnar, Michael Y. Levin, Patrice Godefroid, Automated Whitebox Fuzz Testing. Network and Distributed System Security Symposium (2008)
Alexandre Rebert, David Brumley, Thanassis Avgerinos, Gustavo Grieco, Sang Kil Cha, Jonathan Foote, David Warren, Optimizing seed selection for fuzzing. USENIX Security Symposium, pp. 861-875 (2014)
Sang Kil Cha, Maverick Woo, David Brumley, Program-Adaptive Mutational Fuzzing. 2015 IEEE Symposium on Security and Privacy, pp. 725-741 (2015), 10.1109/SP.2015.50
Cristian Cadar, Daniel Dunbar, Dawson Engler, KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs. Operating Systems Design and Implementation, pp. 209-224 (2008), 10.5555/1855741.1855756
Shipra Agrawal, Navin Goyal, Analysis of Thompson Sampling for the Multi-armed Bandit Problem. Conference on Learning Theory (2012)
Barton P. Miller, Louis Fredriksen, Bryan So, An empirical study of the reliability of UNIX utilities. Communications of the ACM, vol. 33, pp. 32-44 (1990), 10.1145/96267.96279
Patrice Godefroid, Michael Y. Levin, David Molnar, SAGE. Communications of the ACM, vol. 55, pp. 40-44 (2012), 10.1145/2093548.2093564
Maverick Woo, Sang Kil Cha, Samantha Gottlieb, David Brumley, Scheduling black-box mutational fuzzing. Computer and Communications Security, pp. 511-522 (2013), 10.1145/2508859.2516736