Inverted files for text search engines

Authors: Justin Zobel, Alistair Moffat

DOI: 10.1145/1132956.1132959

Keywords: Search engine, Inverted index, Index (publishing), Key (cryptography), Search engine indexing, Information retrieval, Full text search, Web search engine, Computer science, World Wide Web, Information system

Abstract: The technology underlying text search engines has advanced dramatically in the past decade. The development of a family of new index representations has led to a wide range of innovations in index storage, index construction, and query evaluation. While some of these developments have been consolidated in textbooks, many specific techniques are not widely known or the textbook descriptions are out of date. In this tutorial, we introduce the key techniques in the area, describing both a core implementation and how the core can be enhanced through a range of extensions. We conclude with a comprehensive bibliography of text indexing literature.

References (204)
E.J. Schuegraf. Compression of large inverted files with hyperbolic term distribution. Information Processing and Management, vol. 12, pp. 377-384 (1976). DOI: 10.1016/0306-4573(76)90035-2
Anthony Tomasic, Hector Garcia-Molina. Performance issues in distributed shared-nothing information-retrieval systems. Information Processing and Management, vol. 32, pp. 647-665 (1996). DOI: 10.1016/S0306-4573(96)00019-2
P. Zezula, F. Rabitti, P. Tiberio. Dynamic partitioning of signature files. ACM Transactions on Information Systems, vol. 9, pp. 336-367 (1991). DOI: 10.1145/119311.119313
Wai Yee Peter Wong, Dik Lun Lee. Implementations of partial document ranking using inverted files. Information Processing and Management, vol. 29, pp. 647-669 (1993). DOI: 10.1016/0306-4573(93)90085-R
Dik Lun Lee, Chun-Wu Leng. Partitioned signature files: design issues and performance evaluation. ACM Transactions on Information Systems, vol. 7, pp. 158-180 (1989). DOI: 10.1145/65935.65937
David Hawking. Efficiency/effectiveness trade-offs in query processing (from theory into practice workshop, 1998 SIGIR conf.). International ACM SIGIR Conference on Research and Development in Information Retrieval, vol. 32, pp. 16-22 (1998). DOI: 10.1145/305110.305119
C. Stanfill, R. Thau, D. Waltz. A parallel indexed algorithm for information retrieval. International ACM SIGIR Conference on Research and Development in Information Retrieval, vol. 23, pp. 88-97 (1989). DOI: 10.1145/75334.75345
Howard Turtle, James Flood. Query evaluation: strategies and optimizations. Information Processing and Management, vol. 31, pp. 831-850 (1995). DOI: 10.1016/0306-4573(95)00020-H
Ellen M. Voorhees. The efficiency of inverted index and cluster searches. International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 164-174 (1986). DOI: 10.1145/253168.253203
Berthier A. Ribeiro-Neto, Ramurti A. Barbosa. Query performance for tightly coupled distributed digital libraries. ACM International Conference on Digital Libraries, pp. 182-190 (1998). DOI: 10.1145/276675.276695