Authors: George Danezis, Nadia Alshahwan, Earl T. Barr, David Clark
DOI:
Keywords:
Abstract: This work focuses on a specific front of the malware detection arms-race, namely the detection of persistent, disk-resident malware. We exploit normalised compression distance (NCD), an information theoretic measure, applied directly to binaries. Given a zoo of labelled malware and benign-ware, we ask whether a suspect program is more similar to our malware or to our benign-ware. Our approach classifies malware with 97.1% accuracy and a false positive rate of 3%. We achieve our results with off-the-shelf compressors and a standard machine learning classifier, without any specialised knowledge. An end-user need only collect a zoo of malware and benign-ware and can then immediately apply our techniques. We apply statistical rigour to our experiments and to our selection of data. We demonstrate that accuracy can be optimised by combining NCD with the compressibility rates of the executables. We also demonstrate that malware reported within a narrow time frame of a few days is more homogeneous than malware reported over a longer one of two years, but that our method still classifies the latter with 95.2% accuracy and a 5% false positive rate. Due to the use of compression, the time and computation cost of our method is non-trivial. We show that simple approximation techniques can improve the time complexity of our approach by up to 63%. We compare our results to the results of applying the 59 anti-malware programs used on the VirusTotal web site to our malware. Our approach does better than any single one of them, as well as the 59 used collectively.
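The abstract's central measure can be illustrated with a short sketch. NCD is conventionally defined as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed length in bytes. The code below is a minimal illustration under stated assumptions, not the authors' implementation: `zlib` stands in for "an off-the-shelf compressor", the function names are hypothetical, and the nearest-neighbour rule in `nearest_label` is a simplification of the "standard machine learning classifier" that the abstract mentions without naming.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (zlib is an assumed stand-in)."""
    return len(zlib.compress(data, 9))


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by raw size; lower means more compressible."""
    if not data:
        raise ValueError("cannot compute a ratio for empty input")
    return compressed_size(data) / len(data)


def ncd(x: bytes, y: bytes) -> float:
    """Normalised compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def nearest_label(suspect: bytes, zoo: list[tuple[bytes, str]]) -> str:
    """Label of the zoo sample closest to the suspect under NCD.

    Hypothetical 1-nearest-neighbour rule, used here only for illustration.
    """
    if not zoo:
        raise ValueError("zoo must contain at least one labelled sample")
    _, label = min(zoo, key=lambda item: ncd(suspect, item[0]))
    return label
```

In use, `zoo` would hold the raw bytes of each labelled executable paired with a label such as `"malware"` or `"benign"`, and `compression_ratio` could be supplied as an extra feature next to the NCD values. Every NCD evaluation compresses a concatenated pair of files, which is why the cost of the approach grows quickly with the size of the zoo.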