Authors: Kent Griffin, Scott Schneider, Xin Hu, Tzi-cker Chiueh
DOI: 10.1007/978-3-642-04342-0_6
Keywords:
Abstract: Scanning files for signatures is a proven technology, but exponential growth in unique malware programs has caused an explosion in signature database sizes. One solution to this problem is to use string signatures, each of which is a contiguous byte sequence that potentially can match many variants of a malware family. However, it is not clear how to automatically generate these string signatures with a sufficiently low false positive rate. Hancock is the first string signature generation system that takes on this challenge on a large scale. To minimize the false positive rate, Hancock features a scalable model that estimates the occurrence probability of arbitrary byte sequences in goodware programs, a set of library code identification techniques, and diversity-based heuristics that ensure the contexts in which a signature is embedded in containing malware files are similar to one another. With these techniques combined, Hancock is able to automatically generate string signatures with a false positive rate below 0.1%.