Authors: Marcin Kaszkiel, Justin Zobel
DOI: 10.1002/1532-2890(2000)9999:9999<::AID-ASI1075>3.3.CO;2-R
Keywords:
Abstract: Text retrieval systems store a great variety of documents, from abstracts, newspaper articles, and Web pages to journal articles, books, court transcripts, and legislation. Collections of diverse types of documents expose shortcomings in current approaches to ranking. Use of short fragments of documents, called passages, instead of whole documents can overcome these shortcomings: passage ranking provides convenient units of text to return to the user, can avoid the difficulties of comparing documents of different length, and enables identification of short blocks of relevant material among otherwise irrelevant text. In this article, we compare several kinds of passage in an extensive series of experiments. We introduce a new type of passage, overlapping fragments of either fixed or variable length. We show that ranking with these arbitrary passages gives substantial improvements in retrieval effectiveness over traditional document ranking schemes, particularly for queries on collections of long documents. Ranking with arbitrary passages shows consistent improvements compared to ranking with whole documents, and to ranking with previous passage types that depend on document structure or topic shifts in documents.
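
As a rough illustration of the overlapping fixed-length passages the abstract describes, the Python sketch below slides a window over each document and ranks documents by their best-scoring window. The window length, the step between window starts, the raw term-count score, and all function names are assumptions made for illustration; they are not the authors' similarity measure, parameters, or implementation.

```python
from collections import Counter


def fixed_passages(tokens, length, step):
    """Yield (start, window) pairs for overlapping fixed-length windows.

    Windows overlap whenever step < length. A final window anchored at the
    end of the document is added if the stride would otherwise miss the tail.
    """
    if not tokens:
        return
    last_start = max(len(tokens) - length, 0)
    for start in range(0, last_start + 1, step):
        yield start, tokens[start:start + length]
    if last_start % step != 0:
        yield last_start, tokens[last_start:last_start + length]


def passage_score(window, query_terms):
    """Hypothetical score: total occurrences of query terms in the window."""
    counts = Counter(window)
    return sum(counts[term] for term in query_terms)


def best_passage(tokens, query_terms, length=50, step=25):
    """Return (score, start) of the highest-scoring window in one document."""
    best_score, best_start = 0, 0
    for start, window in fixed_passages(tokens, length, step):
        score = passage_score(window, query_terms)
        if score > best_score:
            best_score, best_start = score, start
    return best_score, best_start


def rank_documents(docs, query, length=50, step=25):
    """Rank documents by their best passage instead of the whole text."""
    query_terms = set(query.lower().split())
    scored = []
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        score, start = best_passage(tokens, query_terms, length, step)
        scored.append((doc_id, score, start))
    # Highest passage score first; each entry is (doc_id, score, start).
    return sorted(scored, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    docs = {
        "long": "filler " * 100 + "passage ranking passage",
        "short": "passage " + "other " * 10,
    }
    for row in rank_documents(docs, "passage ranking", length=5, step=2):
        print(row)
```

Because each document is represented by a single window of bounded size, a long document with one dense block of matching text can outrank a short document with a single match, which is the length-comparison problem the abstract says passages help avoid.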