Web Spam Detection Using MapReduce Approach to Collective Classification

Authors: Wojciech Indyk, Tomasz Kajdanowicz, Przemyslaw Kazienko, Slawomir Plamowski

DOI: 10.1007/978-3-642-33018-6_20

Abstract: The paper considers the web spam detection problem. Starting from interconnected spam and non-spam hosts, a collective classification approach based on label propagation aims to discover the spam hosts. Each host is represented as a network node, and the links between hosts constitute the network's edges. The proposed method provides reasonable results and is able to process large data sets because it is built on the MapReduce programming model.
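
The abstract gives only the outline of the method, so the following is a minimal sketch of how label propagation over a host graph could be phrased as repeated map and reduce steps. It is not the authors' implementation. The function names (`map_phase`, `reduce_phase`, `propagate`), the neutral starting score of 0.5, the averaging rule, and the treatment of links as undirected are all assumptions made for illustration, and the sketch runs in a single process rather than on a MapReduce cluster.

```python
from collections import defaultdict


def map_phase(edges, scores):
    """Map step: every host sends its current spam score to its neighbours.

    Links are treated as undirected here, which is an assumption of this sketch.
    """
    for src, dst in edges:
        yield dst, scores[src]
        yield src, scores[dst]


def reduce_phase(pairs, seeds):
    """Reduce step: average the scores each host received.

    Hosts with a known label (the seeds) keep that label unchanged.
    """
    received = defaultdict(list)
    for host, score in pairs:
        received[host].append(score)
    new_scores = {h: sum(v) / len(v) for h, v in received.items()}
    new_scores.update(seeds)  # clamp labelled hosts to their known value
    return new_scores


def propagate(edges, seeds, iterations=20):
    """Run a fixed number of map/reduce rounds and return a score per host.

    A score of 1.0 means spam, 0.0 means non-spam; unlabelled hosts start at
    0.5 (hypothetical choice).
    """
    hosts = {h for edge in edges for h in edge}
    scores = {h: seeds.get(h, 0.5) for h in hosts}
    for _ in range(iterations):
        scores = reduce_phase(map_phase(edges, scores), seeds)
    return scores


if __name__ == "__main__":
    # Hypothetical four-host chain: "a" is known spam, "d" is known non-spam.
    edges = [("a", "b"), ("b", "c"), ("c", "d")]
    seeds = {"a": 1.0, "d": 0.0}
    for host, score in sorted(propagate(edges, seeds).items()):
        print(host, round(score, 3))
```

In this toy chain the unlabelled host next to the spam seed ends with a higher score than the one next to the non-spam seed, which is the behaviour label propagation is meant to produce. On a real cluster the map and reduce steps would be separate jobs over an edge list stored in a distributed file system.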

References (23)
Wojciech Indyk, Tomasz Kajdanowicz, Przemysław Kazienko, Sławomir Plamowski, "MapReduce approach to collective classification for networks", International Conference on Artificial Intelligence and Soft Computing, pp. 656-663 (2012), 10.1007/978-3-642-29347-4_76
Pranam Kolari, Akshay Java, Tim Finin, Anupam Joshi, Tim Oates, "Detecting spam blogs: a machine learning approach", National Conference on Artificial Intelligence, pp. 1351-1356 (2006), 10.13016/M27M0444D
Károly Csalogány, András A. Benczúr, Tamás Sarlós, Máté Uher, "SpamRank -- Fully Automatic Link Spam Detection", Adversarial Information Retrieval on the Web, pp. 25-38 (2005)
Hui Zhang, Ashish Goel, Ramesh Govindan, Kahn Mason, Benjamin Van Roy, "Making Eigenvector-Based Reputation Systems Robust to Collusion", Workshop on Algorithms and Models for the Web Graph, pp. 92-104 (2004), 10.1007/978-3-540-30216-2_8
Zoltán Gyöngyi, Hector Garcia-Molina, Jan Pedersen, "Combating web spam with TrustRank", Very Large Data Bases, pp. 576-587 (2004), 10.1016/B978-012088469-8.50052-8
Rajeev Motwani, Terry Winograd, Lawrence Page, Sergey Brin, "The PageRank Citation Ranking: Bringing Order to the Web", The Web Conference, vol. 98, pp. 161-172 (1999)
"Artificial Intelligence and Soft Computing", Lecture Notes in Computer Science, vol. 6113 (2010), 10.1007/978-3-642-13208-7
Dennis Fetterly, Mark Manasse, Marc Najork, "Spam, damn spam, and statistics: using statistical analysis to locate spam web pages", International Workshop on the Web and Databases, pp. 1-6 (2004), 10.1145/1017074.1017077
Alexandros Ntoulas, Marc Najork, Mark Manasse, Dennis Fetterly, "Detecting spam web pages through content analysis", Proceedings of the 15th International Conference on World Wide Web - WWW '06, pp. 83-92 (2006), 10.1145/1135777.1135794
André Luiz da Costa Carvalho, Paul-Alexandru Chirita, Edleno Silva de Moura, Pável Calado, Wolfgang Nejdl, "Site level noise removal for search engines", Proceedings of the 15th International Conference on World Wide Web - WWW '06, pp. 73-82 (2006), 10.1145/1135777.1135793