Effective ranking with arbitrary passages

Authors: Marcin Kaszkiel, Justin Zobel

DOI: 10.1002/1532-2890(2000)9999:9999<::AID-ASI1075>3.3.CO;2-R

Abstract: Text retrieval systems store a great variety of documents, from abstracts, newspaper articles, and Web pages to journal articles, books, court transcripts, and legislation. Collections of diverse types of documents expose shortcomings in current approaches to ranking. Use of short fragments of documents, called passages, instead of whole documents can overcome these shortcomings: passage ranking provides convenient units of text to return to the user, can avoid the difficulties of comparing documents of different length, and enables identification of short blocks of relevant material among otherwise irrelevant text. In this article, we compare several kinds of passage in an extensive series of experiments. We introduce a new type of passage, overlapping fragments of either fixed or variable length. We show that ranking with these arbitrary passages gives substantial improvements in retrieval effectiveness over traditional document ranking schemes, particularly for queries on collections of long documents. Ranking with arbitrary passages shows consistent improvements compared to ranking with whole documents, and to ranking with previous passage types that depend on document structure or topic shifts in documents.
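
The idea of overlapping fixed-length passages can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the window length and step, the whitespace tokenization, the toy scoring function (a raw count of query-term occurrences), and the choice to score a document by its best passage. None of it is taken from the article, which evaluates its own similarity measures and passage settings.

```python
from collections import Counter


def fixed_length_passages(words, length=150, step=25):
    """Yield (start, passage) pairs for overlapping fixed-length word windows.

    `length` and `step` are illustrative defaults, not values from the article.
    """
    if len(words) <= length:
        # Short documents form a single passage.
        yield 0, words
        return
    last_start = len(words) - length
    for start in range(0, last_start + 1, step):
        yield start, words[start:start + length]
    if last_start % step != 0:
        # Add one final window so the tail of the document is covered.
        yield last_start, words[last_start:]


def passage_score(passage, query_terms):
    """Toy score (an assumption): total occurrences of query terms."""
    counts = Counter(passage)
    return sum(counts[term] for term in query_terms)


def rank_documents(docs, query, length=150, step=25):
    """Rank documents by the score of their best passage, highest first."""
    query_terms = set(query.lower().split())
    scored = []
    for doc_id, text in docs.items():
        words = text.lower().split()
        best = max(
            passage_score(passage, query_terms)
            for _, passage in fixed_length_passages(words, length, step)
        )
        scored.append((doc_id, best))
    # Sort by descending score, breaking ties by document id.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored
```

With this sketch, a document whose query terms cluster inside one window outranks a document in which the same terms are scattered far apart, which is the kind of locality that passage ranking is meant to reward.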
