Author: Graham Spencer
DOI:
Keywords: tf–idf, Software, Computer science, Lookup table, Measure (data warehouse), Cache, Information retrieval, Traverse, Inverted index, Data mining, Term (time)
Abstract: A system, method, and various software products provide for improved information retrieval in very large document databases through the use of a predetermined static cache. The cache includes terms that appear in a large number of documents and, for each such term, a plurality of documents ordered by the contribution the term makes to the score of the document. The contribution is a scalar measure of the influence of the term in the computed score. It reflects both the within-document frequency and the between-document frequency of the term. In addition, each cache includes a lookup table that references selected entries in an inverted index. Queries to the database are then processed by first traversing the cache, obtaining therefrom the documents, and computing the document score from this information. Additional documents for other terms in the query are obtained by looking up the lookup tables of the terms, such as in the inverted index, or by searching the caches of other terms.
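
The query flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`build_index`, `build_static_cache`, `score_query`), the parameters `min_df` and `top_k`, and the weighting `tf * log(1 + N/df)` are all assumptions chosen for illustration, and a plain dictionary lookup stands in for the lookup table into the inverted index.

```python
import math
from collections import defaultdict


def build_index(docs):
    """Build an inverted index (term -> {doc_id: tf}) and an idf table."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word][doc_id] = index[word].get(doc_id, 0) + 1
    n = len(docs)
    # Assumed between-document weight; the abstract does not give a formula.
    idf = {term: math.log(1 + n / len(postings)) for term, postings in index.items()}
    return index, idf


def build_static_cache(index, idf, min_df=2, top_k=2):
    """Precompute, for frequent terms, the documents with the largest contributions."""
    cache = {}
    for term, postings in index.items():
        if len(postings) < min_df:
            continue  # only terms that appear in many documents are cached
        contribs = [(tf * idf[term], doc_id) for doc_id, tf in postings.items()]
        # Highest contribution first; ties broken by document id for determinism.
        contribs.sort(key=lambda c: (-c[0], c[1]))
        cache[term] = contribs[:top_k]
    return cache


def score_query(query, index, idf, cache):
    """Traverse the cache first, then complete scores from the inverted index."""
    terms = set(query.lower().split())
    scores = {}
    seen = set()  # (term, doc_id) pairs already scored from the cache
    for term in terms:
        for contrib, doc_id in cache.get(term, []):
            scores[doc_id] = scores.get(doc_id, 0.0) + contrib
            seen.add((term, doc_id))
    # For each candidate, add contributions of query terms not found in its cache entries.
    for doc_id in scores:
        for term in terms:
            if (term, doc_id) in seen:
                continue
            tf = index.get(term, {}).get(doc_id, 0)
            scores[doc_id] += tf * idf.get(term, 0.0)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


docs = {
    "d1": "cache cache index",
    "d2": "cache query",
    "d3": "index query query",
    "d4": "cache",
}
index, idf = build_index(docs)
cache = build_static_cache(index, idf, min_df=2, top_k=1)
for doc_id, score in score_query("index query", index, idf, cache):
    print(doc_id, round(score, 3))
```

In this toy run only the top document per term is cached, so the score of `d3` for the term `index` has to be completed from the inverted index. A real system would presumably bound that second step as well; here it is exhaustive over the candidates for simplicity.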