Authors: W. Bruce Croft, Pasquale Savino
DOI: 10.1145/42279.45947
Keywords:
Abstract: Signature files provide an efficient access method for text in documents, but retrieval is usually limited to finding documents that contain a specified Boolean pattern of words. Effective retrieval requires that documents with similar meanings be found through a process of plausible inference. The simplest way of implementing this retrieval process is to rank documents in order of their probability of relevance. In this paper techniques are described for implementing probabilistic ranking strategies with sequential and bit-sliced signature files, and the limitations of these implementations with regard to their effectiveness are pointed out. A detailed comparison is made between signature-based ranking techniques and ranking techniques using term-based document representatives and inverted files. The comparison shows that term-based representations are at least competitive (in terms of efficiency) with signature-based representations and, in some situations, superior.