Authors: Pranam Kolari, Akshay Java, Tim Finin, Justin Martineau, Anupam Joshi
DOI:
Keywords: Rank (computer programming), Information retrieval, Set (abstract data type), Spam blog, Computer science
Abstract: The BlogVox system retrieves opinionated blog posts specified by ad hoc queries. BlogVox was developed for the 2006 TREC track by the University of Maryland, Baltimore County and the Johns Hopkins Applied Physics Laboratory using a novel system to recognize legitimate posts and discriminate against spam blogs. It also processes posts to eliminate extraneous non-content, including blog-rolls, link-rolls, advertisements and sidebars. After retrieving posts relevant to a topic query, the system processes them to produce a set of independent features estimating the likelihood that a post expresses an opinion about the topic. These are combined using an SVM-based system and integrated with the relevancy score to rank the results. We evaluate BlogVox's performance against human assessors. We also evaluate the individual splog filtering and non-content removal components of BlogVox.
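The final ranking step described in the abstract can be illustrated with a minimal sketch. It is not BlogVox's implementation: the `Post` fields, the feature values, the weights, the bias, and the mixing parameter `alpha` are all hypothetical, and a hand-set linear decision function stands in for a trained SVM. The weighted sum used to merge the opinion score with the relevancy score is likewise an assumption, since the abstract does not say how the two are integrated.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Post:
    post_id: str
    relevance: float            # retrieval score for the topic query (assumed in [0, 1])
    features: Sequence[float]   # independent opinion-likelihood features (hypothetical)


def opinion_score(features: Sequence[float],
                  weights: Sequence[float],
                  bias: float) -> float:
    """Linear decision value w.x + b, the form a trained linear SVM produces."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def rank_posts(posts: List[Post],
               weights: Sequence[float],
               bias: float,
               alpha: float = 0.5) -> List[Post]:
    """Sort by alpha * relevance + (1 - alpha) * opinion score, highest first.

    The weighted sum is an assumed way of integrating the two scores.
    """
    def combined(p: Post) -> float:
        return alpha * p.relevance + (1 - alpha) * opinion_score(p.features, weights, bias)

    return sorted(posts, key=combined, reverse=True)


# Toy data: post "a" is highly relevant but barely opinionated,
# post "c" is strongly opinionated but barely relevant.
posts = [
    Post("a", relevance=0.9, features=[0.1, 0.0]),
    Post("b", relevance=0.6, features=[0.8, 0.9]),
    Post("c", relevance=0.2, features=[0.9, 0.9]),
]
ranked = rank_posts(posts, weights=[1.0, 1.0], bias=-1.0)
print([p.post_id for p in ranked])
```

In this toy setting the post that is both relevant and opinionated ranks first, which is the behavior an opinion-retrieval ranking is meant to produce; in a real system the weights and bias would come from training on labeled posts rather than being set by hand.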