Web spam identification through content and hyperlinks

Authors: Jacob Abernethy, Olivier Chapelle, Carlos Castillo

DOI: 10.1145/1451983.1451994

Abstract: We present an algorithm, WITCH, that learns to detect spam hosts or pages on the Web. Unlike most other approaches, it simultaneously exploits the structure of the Web graph as well as page contents and features. The method is efficient, scalable, and provides state-of-the-art accuracy on a standard benchmark.

References (15)
Zoltán Gyöngyi, Hector Garcia-Molina, Jan Pedersen. Combating web spam with TrustRank. Very Large Data Bases, pp. 576-587 (2004). DOI: 10.1016/B978-012088469-8.50052-8
Stephanie W. Haas, Erika S. Grams. Page and link classifications: connecting diverse resources. ACM International Conference on Digital Libraries, pp. 99-107 (1998). DOI: 10.1145/276675.276686
Brian D. Davison. Topical locality in the Web. International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 272-279 (2000). DOI: 10.1145/345508.345597
Alexandros Ntoulas, Marc Najork, Mark Manasse, Dennis Fetterly. Detecting spam web pages through content analysis. Proceedings of the 15th International Conference on World Wide Web (WWW '06), pp. 83-92 (2006). DOI: 10.1145/1135777.1135794
Carlos Castillo, Debora Donato, Luca Becchetti, Paolo Boldi, Stefano Leonardi, Massimo Santini, Sebastiano Vigna. A reference collection for web spam. International ACM SIGIR Conference on Research and Development in Information Retrieval, vol. 40, pp. 11-24 (2006). DOI: 10.1145/1189702.1189703
Jonathan R. Shewchuk. An Introduction to the Conjugate Gradient Method Without the Agonizing Pain. Carnegie Mellon University (1994).
Dengyong Zhou, Christopher J. C. Burges, Tao Tao. Transductive link spam detection. Proceedings of the 3rd International Workshop on Adversarial Information Retrieval on the Web (AIRWeb '07), pp. 21-28 (2007). DOI: 10.1145/1244408.1244413
Qingqing Gan, Torsten Suel. Improving web spam classifiers using link structure. Proceedings of the 3rd International Workshop on Adversarial Information Retrieval on the Web (AIRWeb '07), pp. 17-20 (2007). DOI: 10.1145/1244408.1244412
Vijay Krishnan, Rashmi Raj. Web Spam Detection with Anti-Trust Rank. Adversarial Information Retrieval on the Web, pp. 37-40 (2006).
Carlos Castillo, Debora Donato, Aristides Gionis, Vanessa Murdock, Fabrizio Silvestri. Know your neighbors: web spam detection using the web topology. International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 423-430 (2007). DOI: 10.1145/1277741.1277814