The BlogVox Opinion Retrieval System

Authors: Pranam Kolari, Akshay Java, Tim Finin, Justin Martineau, Anupam Joshi

Keywords: Rank (computer programming), Information retrieval, Set (abstract data type), Spam blog, Computer science

Abstract: The BlogVox system retrieves opinionated blog posts specified by ad hoc queries. BlogVox was developed for the 2006 TREC blog track by the University of Maryland, Baltimore County and the Johns Hopkins Applied Physics Laboratory, using a novel system to recognize legitimate posts and discriminate against spam blogs. It also processes posts to eliminate extraneous non-content, including blog-rolls, link-rolls, advertisements and sidebars. After retrieving posts relevant to a topic query, the system processes them to produce a set of independent features estimating the likelihood that a post expresses an opinion about the topic. These are combined using an SVM-based system and integrated with the relevancy score to rank the results. We evaluate BlogVox's performance against human assessors. We also evaluate the individual splog filtering and non-content removal components of BlogVox.
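The final ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the SVM output and the relevancy score are integrated, so the weighted sum below, the `alpha` mixing parameter, the `Post` record, and the example weights and bias are all assumptions made for illustration and are not BlogVox's actual method. A linear decision function stands in for the trained SVM.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Post:
    # Hypothetical record; field names are not taken from BlogVox.
    post_id: str
    relevance: float            # topic relevance from the retrieval engine, assumed in [0, 1]
    features: Sequence[float]   # one value per independent opinion scorer


def opinion_score(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Linear decision value, the form a trained linear SVM would produce."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def rank_posts(posts: List[Post], weights: Sequence[float], bias: float,
               alpha: float = 0.5) -> List[Post]:
    """Sort posts by an assumed weighted mix of relevance and opinion score."""
    def combined(post: Post) -> float:
        return alpha * post.relevance + (1 - alpha) * opinion_score(post.features, weights, bias)
    return sorted(posts, key=combined, reverse=True)


# Toy data: weights and bias are made up, not learned from TREC judgments.
posts = [
    Post("a", relevance=0.9, features=[0.1, 0.0]),
    Post("b", relevance=0.7, features=[0.8, 0.9]),
    Post("c", relevance=0.4, features=[0.2, 0.1]),
]
ranked = rank_posts(posts, weights=[1.0, 1.0], bias=-0.5)
print([p.post_id for p in ranked])
```

In this toy run, post "b" outranks the more topically relevant post "a" because its opinion features are much stronger, which is the kind of reordering an opinion retrieval system aims for.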
