Experimental challenges in cyber security: a story of provenance and lineage for malware

Authors: Tudor Dumitras, Iulian Neamtiu

DOI:

Keywords: Computer science, Honeypot, Empirical research, Metadata, Software evolution, Process (engineering), Data collection, Computer security, Malware, Cluster analysis

Abstract: Rigorous experiments and empirical studies hold the promise of empowering researchers and practitioners to develop better approaches for cyber security. For example, understanding the provenance and lineage of polymorphic malware strains can lead to new techniques for detecting and classifying unknown attacks. Unfortunately, many challenges stand in the way: the lack of sufficient field data (e.g., malware samples and contextual information about their impact in the real world), the lack of metadata about the collection process of the existing data sets, the lack of ground truth, and the difficulty of developing tools and methods for rigorous data analysis. As a first step towards rigorous experimental methods, we introduce two techniques for reconstructing the phylogenetic trees and dynamic control-flow graphs of unknown binaries, inspired by research in software evolution, bioinformatics, and time series analysis. Our approach is based on the observation that the long evolution histories of open source projects provide an opportunity for creating precise models of lineage and provenance, which can be used for detecting and clustering malware as well. As a second step, we present experimental methods that combine the use of a representative corpus of malware and contextual information (gathered from end hosts rather than from network traces or honeypots) with sound data collection and analysis techniques. While our experimental methods serve a concrete purpose (understanding lineage and provenance), they also provide a general blueprint for addressing the threats to the validity of cyber security studies.
