Abstract: The representation of documents and queries as vectors in a high-dimensional space is well established in information retrieval. The author proposes that the semantics of words and contexts in a text be represented as vectors. The dimensions of the space are words, and the initial vectors are determined by the words occurring close to the entity to be represented, which implies that the space has several thousand dimensions (words). This makes the vector representations (which are dense) too cumbersome to use directly. Therefore, dimensionality reduction by means of a singular value decomposition is employed. The author analyzes the structure of the vector representations and applies them to word sense disambiguation and thesaurus induction.
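The pipeline summarized in the abstract (count the words occurring close to each word, then compress the resulting vectors with a singular value decomposition) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy corpus, the window of two words, the use of raw counts, the choice of k = 3, and the helper names `cooccurrence_matrix`, `reduce_dimensions`, and `cosine`.

```python
import numpy as np


def cooccurrence_matrix(tokens, window=2):
    """Count, for each word, the words occurring within `window` positions of it."""
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for pos, word in enumerate(tokens):
        lo = max(0, pos - window)
        hi = min(len(tokens), pos + window + 1)
        for ctx in range(lo, hi):
            if ctx != pos:
                counts[index[word], index[tokens[ctx]]] += 1
    return vocab, counts


def reduce_dimensions(counts, k):
    """Project each row onto the top-k singular directions of the count matrix."""
    u, s, _vt = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k] * s[:k]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


# Hypothetical toy corpus; a real word space would have thousands of dimensions.
tokens = "the cat sat on the mat the dog sat on the rug".split()
vocab, counts = cooccurrence_matrix(tokens, window=2)
vectors = reduce_dimensions(counts, k=3)

# Compare two words in the reduced space.
similarity = cosine(vectors[vocab.index("cat")], vectors[vocab.index("dog")])
print(f"cosine(cat, dog) = {similarity:.3f}")
```

In the reduced space each word is a short dense vector, and words that occur in similar contexts should end up close together. That kind of closeness is what applications such as word sense disambiguation and thesaurus induction would build on.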