Scoring, term weighting, and the vector space model

Authors: Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze

DOI: 10.1017/CBO9780511809071.007

Abstract: Thus far, we have dealt with indexes that support Boolean queries: A document either matches or does not match a query. In the case of large document collections, the resulting number of matching documents can far exceed the number a human user could possibly sift through. Accordingly, it is essential for a search engine to rank-order the documents matching a query. To do this, the search engine computes, for each matching document, a score with respect to the query at hand. In this chapter, we initiate the study of assigning a score to a (query, document) pair. This chapter consists of three main ideas. We introduce parametric and zone indexes in Section 6.1, which serve two purposes. First, they allow us to index and retrieve documents by metadata, such as the language in which a document is written. Second, they give us a simple means for scoring (and thereby ranking) documents in response to a query. Next, in Section 6.2 we develop the idea of weighting the importance of a term in a document, based on the statistics of occurrence of the term. In Section 6.3, we show that by viewing each document as a vector of such weights, we can compute a score between a query and each document. This view is known as vector space scoring. Section 6.4 develops several variants of term-weighting for the vector space model. Chapter 7 develops computational aspects of vector space scoring and related topics.
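The idea of treating each document as a vector of term weights and scoring it against a query can be illustrated with a minimal sketch. The weighting used here (raw term frequency times log10(N/df)) and cosine similarity are one common choice, assumed for illustration; the abstract does not fix a particular formula. The function names `idf_table`, `weight_vector`, `cosine`, and `rank` are hypothetical, as is the toy collection in the usage note.

```python
import math
from collections import Counter


def idf_table(docs):
    """Inverse document frequency log10(N / df) for every term in the collection."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log10(n / count) for term, count in df.items()}


def weight_vector(tokens, idf):
    """Sparse tf-idf vector; terms absent from the collection get weight 0."""
    return {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokens).items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (doc_index, score) pairs sorted by descending score."""
    idf = idf_table(docs)
    q_vec = weight_vector(query, idf)
    scores = [(i, cosine(q_vec, weight_vector(doc, idf)))
              for i, doc in enumerate(docs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Assuming a toy collection of three tokenized documents, `rank(["car", "insurance"], [["car", "insurance", "auto"], ["best", "car", "deal"], ["auto", "insurance", "quote"]])` places the first document at the top, since it is the only one containing both query terms.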