Authorship Attribution of Android Apps

Authors: Hugo Gonzalez, Natalia Stakhanova, Ali A. Ghorbani

DOI: 10.1145/3176258.3176322

Keywords: World Wide Web, ARPANET, Computer virus, Malware, Source code, Writing style, Android (operating system), Adversarial system, Adversary, Computer science

Abstract: Since the first computer virus hit the Advanced Research Projects Agency Network (ARPANET) in the early 1970s, the security community's interest has revolved around ways to expose the identities of malware writers. Knowledge of the adversarial identity promised additional leverage to security experts in their ongoing battle against those perpetrators. At the dawn of the computing era, when malware writers and malicious software were characterized by a lack of experience and relative simplicity, the task of uncovering the identities of malware writers was more or less straightforward. Manual analysis of source code often revealed personal, identifiable information embedded by the authors themselves. But these times are long gone. Modern-day malware writers extensively use numerous malware code generators to mass-produce new malware variants and employ advanced obfuscation techniques to hide their identities. As a result, the work of security experts trying to uncover the identities of malware writers has become significantly more challenging and time-consuming. To gain insight into the identity of an adversary, we turn our attention to authorship attribution research, which offers a broad spectrum of techniques for identifying the author of a document based on the analysis of the author's writing style. Equipped with authorship attribution methods, we explore Android binaries and the role of features related to the development process in the determination of binary authorship. Within this context, we propose an incremental approach to perform authorship attribution of Android apps. The approach first analyzes a set of known authors and then generates author profiles for unknown apps. We assess the effectiveness of our approach on several sets of legitimate Android apps produced by actual developers, as opposed to using artificially created authors' data. We achieve 97.5% accuracy on known authors. We further evaluate our approach on more than 131,000 apps collected from various sources, including 10 different markets around the globe.
