A search system for mathematical expressions on software binaries

Authors: Ridhi Jain, Sai Prathik, Venkatesh Vinayakarao, Rahul Purandare

DOI: 10.1145/3196398.3196413

Keywords:

Abstract: Developers often ask for libraries that implement specific mathematical expressions. A fundamental bottleneck in building information retrieval (IR) systems to answer such queries is the inability to detect a given mathematical expression in software binaries. While a few math IR solutions such as EgoMath2 and Tangent-3 work over text documents, none exist to search software binaries. Our vision is to build a system to search binaries containing mathematical expressions. The wide variety of compilers, and the differences in the way they optimize code, pose difficult challenges to solving this problem. In this work, we discuss our preliminary results in detecting mathematical expressions in software binaries. We use a knowledge-base-assisted approach and are able to detect mathematical expressions with a precision of 80% and a recall of 53%. This opens up interesting research opportunities in the areas of security and performance, and can help analysts in identifying and analyzing mathematical implementations in software binaries.
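For readers unfamiliar with the two reported metrics, the sketch below shows how precision and recall are conventionally computed for a detector. The counts used (8 true positives, 2 false positives, 7 false negatives) are hypothetical values chosen only because they reproduce ratios close to the reported 80% and 53%; they are not taken from the paper, and the function names are illustrative.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of reported detections that are correct."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos: int, false_neg: int) -> float:
    """Fraction of expressions actually present that were detected."""
    return true_pos / (true_pos + false_neg)


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical counts, NOT from the paper: they merely yield ratios
# near the reported precision (80%) and recall (53%).
tp, fp, fn = 8, 2, 7
p = precision(tp, fp)
r = recall(tp, fn)
print(f"precision={p:.2f} recall={r:.2f} f1={f1_score(p, r):.2f}")
```

Read this way, a precision of 80% means that most reported matches are genuine, while a recall of 53% means that roughly half of the expressions present in the binaries are still missed.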

References (19)
Jozef Mišutka, Leo Galamboš. System description: EgoMath2 as a tool for mathematical searching on wikipedia.org. MKM'11 Proceedings of the 18th Calculemus and 10th international conference on Intelligent computer mathematics, pp. 307-309 (2011). DOI: 10.1007/978-3-642-22673-1_30
Zhenkai Liang, Juan Caballero, Dawn Song, David Brumley, James Newsome. Towards automatic discovery of deviations in binary implementations with applications to error detection and fingerprint generation. USENIX Security Symposium, pp. 15- (2007).
Robert Miner, Rajesh Munavalli. An Approach to Mathematical Search Through Query Formulation and Data Normalization. Calculemus '07 / MKM '07 Proceedings of the 14th symposium on Towards Mechanized Mathematical Assistants: 6th International Conference, pp. 342-355 (2007). DOI: 10.1007/978-3-540-73086-6_27
Lannan Luo, Jiang Ming, Dinghao Wu, Peng Liu, Sencun Zhu. Semantics-based obfuscation-resilient binary code similarity comparison with applications to software plagiarism detection. Foundations of Software Engineering, pp. 389-400 (2014). DOI: 10.1145/2635868.2635900
Jiyong Jang, David Brumley, Shobha Venkataraman. BitShred. Proceedings of the 18th ACM conference on Computer and communications security - CCS '11, pp. 309-320 (2011). DOI: 10.1145/2046707.2046742
Shahab Kamali, Frank Wm. Tompa. Retrieving documents with mathematical content. International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 353-362 (2013). DOI: 10.1145/2484028.2484083
Andreas Sæbjørnsen, Jeremiah Willcock, Thomas Panas, Daniel Quinlan, Zhendong Su. Detecting code clones in binary executables. Proceedings of the eighteenth international symposium on Software testing and analysis - ISSTA '09, pp. 117-128 (2009). DOI: 10.1145/1572272.1572287
O. Caprotti, D. Carlisle. OpenMath and MathML: semantic markup for mathematics. ACM Crossroads Student Magazine, vol. 6, pp. 11-14 (1999). DOI: 10.1145/333104.333110
Tam T. Nguyen, Kuiyu Chang, Siu Cheung Hui. A math-aware search engine for math question answering system. Proceedings of the 21st ACM international conference on Information and knowledge management - CIKM '12, pp. 724-733 (2012). DOI: 10.1145/2396761.2396854
Rajesh Munavalli, Robert Miner. MathFind. Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06, p. 735 (2006). DOI: 10.1145/1148170.1148348