Authors: Daniel Osuna-Ontiveros, Ivan Lopez-Arevalo, Victor Sosa-Sosa
DOI: 10.1109/ICEEE.2011.6106659
Keywords:
Abstract: Nowadays, computer users store a lot of text documents, which requires fast and precise searches over those documents. The goal of Information Retrieval (IR) models is to provide users with the documents that will satisfy their information needs. The core of such models is the document representation used in indexing. Traditional IR models handle the frequency of query terms. The disadvantage of these models is that they exclusively consider query terms and ignore similar terms. This paper proposes a topic-based approach to represent the topics associated with documents. Documents are modeled by using clustering algorithms based on natural language processing. The result of this proposal is a document-topic matrix denoting the importance of topics inside documents. In this way, each document is converted into a vector of topics. Thus, a similarity measure can be applied to retrieve the most relevant documents.
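The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes a small, made-up document-topic matrix, a query that has already been mapped to the same topic space, and cosine similarity as the similarity measure. The abstract does not name a specific measure, and the topic names, weights, and function names below are hypothetical.

```python
import math

# Hypothetical document-topic matrix: one row per document, one column per topic.
# Topic names and weights are invented for illustration; they are not from the paper.
TOPICS = ["databases", "networks", "cooking"]
DOC_TOPIC = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.2, 0.8, 0.0],
    "doc_c": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(query_vector, doc_topic=DOC_TOPIC):
    """Return (doc_id, score) pairs sorted from most to least similar to the query."""
    scored = [(doc_id, cosine(query_vector, vec)) for doc_id, vec in doc_topic.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Assumed query vector: the query is entirely about the "databases" topic.
    for doc_id, score in rank_documents([1.0, 0.0, 0.0]):
        print(f"{doc_id}: {score:.3f}")
```

Because documents and the query live in the same topic space, a document can rank highly even when it shares no literal terms with the query, which is the limitation of term-frequency models that the abstract points out.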