Methods for ranking user-generated text streams

Author: Mostafa Keikha

DOI:

Keywords: User-generated content, Information needs, Temporal information, Ranking (information retrieval), State (computer science), Term (time), Task (project management), Computer science, Information retrieval, Rank (computer programming)

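Abstract: User-generated content is one of the main sources of information on the Web nowadays. With the huge amount of this type of data being generated every day, having an efficient and effective retrieval system is essential. The goal of such a system is to enable users to search through this data and retrieve documents relevant to their information needs. Among the different retrieval tasks on user-generated content, retrieving and ranking streams is one of the important ones and has various applications. The goal of this task is to rank streams, as collections of documents with chronological order, in response to a user query. This differs from traditional retrieval tasks, where the goal is to rank single documents and temporal properties are less important in the ranking. In this thesis we investigate the problem of ranking user-generated streams with a case study in blog feed retrieval. Blogs, like all other user-generated streams, have specific properties and require new considerations in the retrieval methods. Blog feed retrieval can be defined as retrieving blogs with a recurrent interest in the topic of the given query. We define three different properties of blog feed retrieval, each of which introduces new challenges in the ranking task. These properties include: 1) term mismatch in blog retrieval, 2) evolution of topics in blogs, and 3) diversity of blog posts. For each of these properties, we investigate its corresponding challenges and propose solutions to overcome those challenges. We further analyze the effect of our solutions on the performance of a retrieval system. We show that taking the new properties into account when developing the retrieval system can help us improve state-of-the-art retrieval methods. In all the proposed methods, we specifically pay attention to temporal properties, which we believe are important information in any type of stream. We show that, when combined with content-based information, temporal information can be useful in different situations. Although we apply our methods to blog feed retrieval, they are mostly general methods that are applicable to similar stream ranking problems, such as ranking experts or ranking Twitter users.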
