Towards quantifying visual similarity of domain names for combating typosquatting abuse

Authors: Tingwen Liu, Yang Zhang, Jinqiao Shi, Ya Jing, Quangang Li

DOI: 10.1109/MILCOM.2016.7795422

Abstract: Typosquatting has become a speculative and serious phenomenon for both Internet users and brand owners of popular websites. Typosquatters register domain names similar to those of popular websites to profit from displaying advertisements, redirecting traffic to third-party pages, deploying phishing sites, or serving malware. Thus, much work has been done on measuring typosquatting in terms of distribution, monetization, cost, etc. This paper does not measure typosquatting, but tries to combat typosquatting abuse from the anomaly detection view: a domain name that looks very like the domain name of one popular website is suspicious. We propose TypoPegging, a reverse lookup approach to quickly and accurately get the most similar popular domain name for a given domain. Specifically, we propose a novel quantitative method to measure the visual similarity of two domains. The proposed method is based on generalized Levenshtein distance and takes insights from our measured typosquatting characteristics. Then we give an efficient approach to search for the domain name with the maximum similarity over a popular domain set. We accelerate the searching process with the triangle inequality of the metric and a locality sensitive hashing algorithm. Preliminary results show that our method is effective in differentiating typosquatting domain names from normal ones. Our method can also achieve a speedup of orders of magnitude compared with the linear search method.
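
The abstract names two ingredients that are easy to illustrate: a generalized (weighted) Levenshtein distance and a nearest-neighbor search pruned with the triangle inequality. The Python sketch below is only an illustration under stated assumptions, not the paper's implementation. The confusable-character groups, the integer costs, the single-pivot pruning scheme, and the names `visual_distance`, `build_index`, and `nearest_popular` are all hypothetical; the paper's actual weights come from its measured typosquatting characteristics, which the abstract does not give. The locality sensitive hashing step is omitted.

```python
# Hypothetical groups of visually confusable characters (NOT from the paper).
# Groups are disjoint so that substitution costs obey the triangle inequality.
CONFUSABLE_GROUPS = ["l1i", "o0", "5s", "2z"]
GROUP_OF = {ch: idx for idx, group in enumerate(CONFUSABLE_GROUPS) for ch in group}

# Assumed integer costs (integers avoid floating-point rounding issues).
INDEL_COST = 10       # insert or delete one character
SUB_COST = 10         # substitute two unrelated characters
CONFUSABLE_COST = 3   # substitute two look-alike characters


def sub_cost(a, b):
    """Cost of replacing character a with character b."""
    if a == b:
        return 0
    ga, gb = GROUP_OF.get(a), GROUP_OF.get(b)
    if ga is not None and ga == gb:
        return CONFUSABLE_COST
    return SUB_COST


def visual_distance(s, t):
    """Weighted Levenshtein distance; smaller means more visually similar."""
    prev = [j * INDEL_COST for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        cur = [i * INDEL_COST]
        for j, b in enumerate(t, 1):
            cur.append(min(
                prev[j] + INDEL_COST,          # delete a
                cur[j - 1] + INDEL_COST,       # insert b
                prev[j - 1] + sub_cost(a, b),  # substitute or match
            ))
        prev = cur
    return prev[-1]


def build_index(popular, pivot):
    """Precompute the distance from one pivot string to every popular domain."""
    return pivot, [(name, visual_distance(pivot, name)) for name in popular]


def nearest_popular(query, index):
    """Return (closest popular domain, distance, number of full distance computations)."""
    pivot, entries = index
    d_query_pivot = visual_distance(query, pivot)
    best, best_d, computed = None, None, 0
    for name, d_pivot_name in entries:
        # Triangle inequality: |d(q, pivot) - d(pivot, p)| <= d(q, p).
        lower_bound = abs(d_query_pivot - d_pivot_name)
        if best_d is not None and lower_bound >= best_d:
            continue  # this candidate cannot beat the current best
        d = visual_distance(query, name)
        computed += 1
        if best_d is None or d < best_d:
            best, best_d = name, d
    return best, best_d, computed


if __name__ == "__main__":
    popular = ["google.com", "paypal.com", "amazon.com"]
    index = build_index(popular, pivot="google.com")
    print(nearest_popular("paypa1.com", index))
```

With these assumed costs, a look-alike substitution such as `1` for `l` is much cheaper than an arbitrary edit, so `paypa1.com` lands close to `paypal.com`. The pruning step is safe only because the assumed costs are symmetric and satisfy the triangle inequality, which makes the distance a metric.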

References (12)
Chad Verbowski, Jeffrey Wang, Yi-Min Wang, Doug Beck, Brad Daniels, Strider Typo-Patrol: Discovery and Analysis of Systematic Typo-Squatting. Conference on Steps to Reducing Unwanted Traffic on the Internet, p. 5 (2006)
Mohammad Taha Khan, Xiang Huo, Zhou Li, Chris Kanich, Every Second Counts: Quantifying the Negative Externalities of Cybercrime via Typosquatting. 2015 IEEE Symposium on Security and Privacy, pp. 135-150 (2015), 10.1109/SP.2015.16
Mark Felegyhazi, Chris Kanich, Jonathan Spring, Balazs Kocso, Janos Szurdi, Gabor Cseh, The Long "Taile" of Typosquatting Domain Names. USENIX Security Symposium, pp. 191-206 (2014)
Pieter Agten, Wouter Joosen, Frank Piessens, Nick Nikiforakis, Seven Months' Worth of Mistakes: A Longitudinal Study of Typosquatting Abuse. Network and Distributed System Security Symposium (2015), 10.14722/NDSS.2015.23058
Tristan Halvorson, Kirill Levchenko, Stefan Savage, Geoffrey M. Voelker, XXXtortion?: Inferring Registration Intent in the .XXX TLD. The Web Conference, pp. 901-912 (2014), 10.1145/2566486.2567995
Thomas Vissers, Wouter Joosen, Nick Nikiforakis, Parking Sensors: Analyzing and Detecting Parked Domains. Network and Distributed System Security Symposium, p. 53 (2015), 10.14722/NDSS.2015.23053
M. Slaney, M. Casey, Locality-Sensitive Hashing for Finding Nearest Neighbors [Lecture Notes]. IEEE Signal Processing Magazine, vol. 25, pp. 128-131 (2008), 10.1109/MSP.2007.914237
Guanchen Chen, Matthew F. Johnson, Pavan R. Marupally, Naveen K. Singireddy, Xin Yin, Vamsi Paruchuri, Combating Typo-Squatting for Safer Browsing. Advanced Information Networking and Applications, pp. 31-36 (2009), 10.1109/WAINA.2009.98
Yushi Jing, S. Baluja, VisualRank: Applying PageRank to Large-Scale Image Search. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, pp. 1877-1890 (2008), 10.1109/TPAMI.2008.121