Authors: Harish Karnick, Sumit Bhagwani, Shrutiranjan Satapathy
DOI:
Keywords:
Abstract: The paper aims to come up with a system that examines the degree of semantic equivalence between two sentences. At its core is the attempt to grade the similarity of two sentences by finding a maximal weighted bipartite match between their tokens, where tokens include single words, or multi-words in the case of Named Entities and adjectivally and numerically modified words. Two token similarity measures are used for the task - a WordNet based similarity measure, and a statistical word similarity measure which overcomes the shortcomings of WordNet based similarity. As part of the task we created three systems: we explore a simple bag-of-words tokenization scheme; a more careful tokenization scheme which captures named entities, times, dates, monetary entities etc.; and finally we try to capture the context around tokens using grammatical dependencies.
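A minimal sketch of the bipartite-matching idea the abstract describes, under stated assumptions: NLTK's WordNet is used as the token similarity measure and SciPy's assignment solver computes the maximal weighted bipartite match. The paper additionally uses a statistical similarity measure and richer multi-word tokens; those are not reproduced here, and the normalization choice below is illustrative only.

```python
# Sketch: score sentence similarity via a maximal weighted bipartite
# match over tokens (requires: nltk.download('wordnet')).
import numpy as np
from nltk.corpus import wordnet as wn
from scipy.optimize import linear_sum_assignment


def token_similarity(w1, w2):
    """WordNet path similarity over the best synset pair; 0 if none."""
    best = 0.0
    for s1 in wn.synsets(w1):
        for s2 in wn.synsets(w2):
            sim = s1.path_similarity(s2)
            if sim is not None and sim > best:
                best = sim
    return best


def sentence_similarity(tokens_a, tokens_b):
    """Grade two tokenized sentences by a maximal weighted bipartite match."""
    weights = np.array([[token_similarity(a, b) for b in tokens_b]
                        for a in tokens_a])
    # linear_sum_assignment minimizes cost, so negate to maximize weight.
    rows, cols = linear_sum_assignment(-weights)
    matched = weights[rows, cols].sum()
    # Normalize by the longer sentence so unmatched tokens lower the score
    # (an assumed normalization, not necessarily the paper's).
    return matched / max(len(tokens_a), len(tokens_b))


if __name__ == "__main__":
    print(sentence_similarity(["a", "cat", "sat"], ["a", "dog", "slept"]))
```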