System and method for context-based document retrieval

Authors: William Russell Softky, Shermann Loyall Min, Constantin Lorenzo Tanno, Zachary Frank Mainen

DOI:

Keywords:

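Abstract: A system and method for document retrieval is disclosed. The invention addresses a major problem in text-based document retrieval: rapidly finding a small subset of documents in a large document collection (e.g. Web pages on the Internet) that are relevant to a limited set of query terms supplied by the user. The invention is based on utilizing information contained in the document collection about the statistics of word relationships (“context”) to facilitate the specification of search queries and document comparison. The method consists of first compiling word relationships into a context database that captures the statistics of word proximity and occurrence throughout the document collection. At retrieval time, a search matrix is computed from a set of user-supplied keywords and the context database. For each document in the collection, a similar matrix is computed using the contents of the document and the context database. Document relevance is determined by comparing the similarity of the search and document matrices. The disclosed system therefore retrieves documents with contextual similarity rather than word frequency similarity, simplifying search specification while allowing greater search precision.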
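The pipeline summarized in the abstract (context database, search matrix, document matrix, matrix comparison) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the co-occurrence window size, the representation of a "matrix" as one sparse context row per word, the cosine-based comparison, and all function names (`build_context_db`, `context_matrix`, `relevance`, `retrieve`) are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

PUNCT = ".,;:!?()\"'"

def tokenize(text):
    """Lowercase whitespace tokenizer that strips surrounding punctuation."""
    return [t for t in (w.strip(PUNCT).lower() for w in text.split()) if t]

def build_context_db(docs, window=2):
    """Assumed context database: for every word, count the words that
    occur within `window` positions of it anywhere in the collection."""
    db = defaultdict(Counter)
    for text in docs:
        words = tokenize(text)
        for i, w in enumerate(words):
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    db[w][words[j]] += 1
    return db

def context_matrix(words, db):
    """Assumed matrix form: one sparse row (context vector) per distinct
    word that the context database knows about."""
    return {w: db[w] for w in set(words) if w in db}

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def relevance(search_m, doc_m):
    """Assumed comparison: for each search row take the best-matching
    document row, then average over the search rows."""
    if not search_m or not doc_m:
        return 0.0
    best = [max(cosine(q, d) for d in doc_m.values()) for q in search_m.values()]
    return sum(best) / len(best)

def retrieve(query, docs, db):
    """Return (score, document index) pairs, most relevant first."""
    search_m = context_matrix(tokenize(query), db)
    scored = [(relevance(search_m, context_matrix(tokenize(t), db)), i)
              for i, t in enumerate(docs)]
    return sorted(scored, reverse=True)

if __name__ == "__main__":
    collection = [
        "the bank approved the loan",
        "the river bank flooded the field",
        "interest rates on the loan rose",
    ]
    db = build_context_db(collection)
    for score, idx in retrieve("loan", collection, db):
        print(round(score, 3), collection[idx])
```

In this sketch a document that never mentions a keyword can still receive a nonzero score when its words share context with the keyword, which is the contrast with word-frequency matching that the abstract draws. A real system would need a scalable store for the context statistics; the in-memory dictionaries above are only for readability.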

References (10)
Alan F. Smeaton, Mark E. Burnett, Francis Crimmins, Gerard Quinn, An architecture for efficient document clustering and retrieval on a dynamic collection of newspaper texts, IRSG'98 Proceedings of the 20th Annual BCS-IRSG conference on Information Retrieval Research, pp. 10-10 (1998), 10.14236/EWIC/IRSG1998.10
Franz Weckesser, Thomas Pease, Richard G. Graham, Dale Waddell, Ray Daley, Catherine Leininger, John Holt, Darin W. Mcbeath, Minh Doan, David James Miller, Allan X. Lu, Stephen M. Sever, Associative text search and retrieval system (1994)
Jeffrey Esakov, Suz Hsi Wan, Liviu Chiriac, Dina Kravets, Search data processor (1998)
David B. Johnson, Thomas Hampp-Bahamueller, Text categorization toolkit (1998)