Implementing ranking strategies using text signatures

Authors: W. Bruce Croft, Pasquale Savino

DOI: 10.1145/42279.45947

Abstract: Signature files provide an efficient access method for text in documents, but retrieval is usually limited to finding documents that contain a specified Boolean pattern of words. Effective retrieval requires that documents with similar meanings be found through a process of plausible inference. The simplest way of implementing this retrieval process is to rank documents in order of their probability of relevance. In this paper techniques are described for implementing probabilistic ranking strategies with sequential and bit-sliced signature files, and the limitations of these implementations with regard to their effectiveness are pointed out. A detailed comparison is made between signature-based ranking techniques and ranking techniques using term-based document representatives and inverted files. The comparison shows that term-based document representations are at least competitive (in terms of efficiency) with signature files and, in some situations, superior.
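
As a rough illustration of what ranking over a sequential signature file can look like, the following Python sketch builds superimposed-coding signatures and scores each document by summing the weights of the query terms whose bits are all present in its signature. This is not the authors' algorithm: the signature width, the number of bits per term, the MD5-based hashing, and the names `term_mask`, `make_signature`, and `rank` are all assumptions made for the example.

```python
import hashlib

SIG_BITS = 128      # signature width in bits (assumed value)
BITS_PER_TERM = 3   # bits set per term (assumed value)


def term_mask(term):
    """Hash a term to a bit mask with at most BITS_PER_TERM bits set."""
    mask = 0
    for i in range(BITS_PER_TERM):
        digest = hashlib.md5(f"{i}:{term}".encode("utf-8")).digest()
        mask |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return mask


def make_signature(terms):
    """Superimpose (bitwise OR) the masks of all terms in a document."""
    signature = 0
    for term in terms:
        signature |= term_mask(term)
    return signature


def rank(query_weights, signatures):
    """Scan every signature and score documents by matched query weights.

    query_weights: dict mapping query term -> weight
    signatures:    dict mapping document id -> integer signature
    Returns (doc_id, score) pairs, highest score first.
    """
    masks = {term: term_mask(term) for term in query_weights}
    scores = {}
    for doc_id, signature in signatures.items():
        score = sum(
            weight
            for term, weight in query_weights.items()
            # A term "matches" if all of its bits are set in the signature.
            if signature & masks[term] == masks[term]
        )
        if score > 0:
            scores[doc_id] = score
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


signatures = {
    "d1": make_signature(["signature", "file", "ranking"]),
    "d2": make_signature(["inverted", "file"]),
    "d3": make_signature(["parallel", "search"]),
}
results = rank({"signature": 2.0, "file": 1.0}, signatures)
```

A document that contains a query term always matches that term's mask, so `d1` receives the full query weight. The reverse does not hold: bits set by other terms can make a signature match a term the document does not contain (a false drop), so a score computed this way is an upper bound on the true matched weight. The signature also records only presence, not within-document term frequency, which restricts the weighting schemes that can be applied.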

References (28)
Ron Sacks-Davis, Kotagiri Ramamohanarao, A two level superimposed coding scheme for partial match retrieval. Information Systems, vol. 8, pp. 273-280 (1983). DOI: 10.1016/0306-4379(83)90013-3
Stone, Parallel Querying of Large Databases: A Case Study. IEEE Computer, vol. 20, pp. 11-21 (1987). DOI: 10.1109/MC.1987.1663384
C.S. Roberts, Partial-match retrieval via the method of superimposed codes. Proceedings of the IEEE, vol. 67, pp. 1624-1642 (1979). DOI: 10.1109/PROC.1979.11543
S. Christodoulakis, C. Faloutsos, A Multimedia Document Server. IEEE Aerospace and Electronic Systems Magazine, vol. 1, pp. 2-9 (1986). DOI: 10.1109/MAES.1986.5004989
J. Fagan, Automatic phrase indexing for document retrieval. Proceedings of the 10th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '87, pp. 91-101 (1987). DOI: 10.1145/42005.42016
W. B. Croft, R. Krovetz, Interactive retrieval of office documents. ACM SIGOIS Bulletin, vol. 9, pp. 228-235 (1988). DOI: 10.1145/966861.45435
Chris Faloutsos, Stavros Christodoulakis, Signature files: an access method for documents and its analytical performance evaluation. ACM Transactions on Information Systems, vol. 2, pp. 267-288 (1984). DOI: 10.1145/2275.357411
University of Toronto. Computer Systems Research Group, Design Considerations for a Message File Server. IEEE Transactions on Software Engineering, vol. SE-10, pp. 201-210 (1984). DOI: 10.1109/TSE.1984.5010223
Craig Stanfill, Brewster Kahle, Parallel free-text search on the connection machine system. Communications of the ACM, vol. 29, pp. 1229-1239 (1986). DOI: 10.1145/7902.7907
W. Bruce Croft, Thomas J. Parenty, A comparison of a network structure and a database system used for document retrieval. Information Systems, vol. 10, pp. 377-390 (1985). DOI: 10.1016/0306-4379(85)90042-0