Authors: William Russell Softky, Shermann Loyall Min, Constantin Lorenzo Tanno, Zachary Frank Mainen
DOI:
Keywords:
Abstract: A system and method for document retrieval is disclosed. The invention addresses a major problem in text-based document retrieval: rapidly finding a small subset of documents in a large collection (e.g. Web pages on the Internet) that are relevant to a limited set of query terms supplied by a user. The invention is based on utilizing information contained in the document collection about the statistics of word relationships (“context”) to facilitate the specification of search queries and the comparison of documents. The method consists of first compiling the document collection into a context database that captures the statistics of word proximity and occurrence throughout the collection. At query time, a query matrix is computed from the user-supplied keywords and the context database. For each document in the collection, a similar matrix is computed using the contents of the document and the context database. Document relevance is determined by comparing the similarity of the query and document matrices. The disclosed method therefore retrieves documents with contextual similarity rather than word frequency similarity, simplifying query specification while allowing greater precision.
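
The pipeline outlined in the abstract (context database, query matrix, per-document matrix, similarity comparison) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the disclosure: the whitespace tokenizer, the symmetric proximity window, the reduction of each "matrix" to a single summed context profile, the use of cosine similarity, and the function names `build_context_db`, `context_profile`, `cosine`, and `rank`.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a demo.
    return text.lower().split()


def build_context_db(docs, window=1):
    """Count, over the whole collection, how often each word appears
    within `window` tokens of every other word (word proximity statistics)."""
    db = defaultdict(Counter)
    for doc in docs:
        toks = tokenize(doc)
        for i, word in enumerate(toks):
            lo = max(0, i - window)
            hi = min(len(toks), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    db[word][toks[j]] += 1
    return db


def context_profile(tokens, db):
    """Sum the context rows of the given tokens.
    Simplification (assumption): the matrix is collapsed to one sparse vector."""
    profile = Counter()
    for word in tokens:
        profile.update(db.get(word, {}))
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs, db):
    """Return (score, doc_index) pairs, highest contextual similarity first."""
    q = context_profile(tokenize(query), db)
    scored = [(cosine(q, context_profile(tokenize(d), db)), i)
              for i, d in enumerate(docs)]
    return sorted(scored, reverse=True)


# Toy collection (hypothetical data).
docs = ["cat chases mouse", "dog chases ball", "bank raises rates"]
db = build_context_db(docs, window=1)
for score, idx in rank("cat", docs, db):
    print(round(score, 3), docs[idx])
```

In this toy collection the second document never mentions "cat", yet it still receives a nonzero score because "cat" and "dog" share the context word "chases"; the third document shares no context and scores zero. That is the distinction the abstract draws between contextual similarity and word frequency similarity, shown here only under the simplifying assumptions listed above.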