Semantic textual similarity using maximal weighted bipartite graph matching

Authors: Harish Karnick, Sumit Bhagwani, Shrutiranjan Satapathy

Abstract: The paper aims to come up with a system that examines the degree of semantic equivalence between two sentences. At the core of the paper is the attempt to grade the similarity of two sentences by finding the maximal weighted bipartite match between the tokens of the two sentences. The tokens include single words, or multi-words in the case of named entities and adjectivally and numerically modified words. Two token similarity measures are used for the task: WordNet-based similarity, and a statistical word similarity measure that overcomes the shortcomings of WordNet-based similarity. As part of the three systems created for the task, we explore a simple bag-of-words tokenization scheme, a more careful tokenization scheme that captures named entities, times, dates, monetary entities, etc., and finally try to capture the context around tokens using grammatical dependencies.
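
The core idea of the abstract, scoring two sentences by a maximum-weight bipartite match between their tokens, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_similarity` and its `TOY_SCORES` table are hypothetical stand-ins for the WordNet-based and statistical measures, the matching is solved with a textbook Hungarian algorithm, and normalizing by the longer sentence's token count is an assumption made here for the example.

```python
from typing import Callable, List, Sequence, Tuple


def max_weight_matching(weights: Sequence[Sequence[float]]) -> List[Tuple[int, int]]:
    """Return (row, col) pairs of a maximum-weight bipartite matching.

    Hungarian algorithm on the negated weights (min-cost assignment).
    Every node on the smaller side is matched, which is optimal when
    all weights are non-negative.
    """
    n = len(weights)
    m = len(weights[0]) if n else 0
    if n == 0 or m == 0:
        return []
    if n > m:
        # The algorithm below needs rows <= cols, so transpose and flip back.
        transposed = [[weights[i][j] for i in range(n)] for j in range(m)]
        return sorted((i, j) for j, i in max_weight_matching(transposed))

    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-indexed)
    v = [0.0] * (m + 1)   # column potentials (1-indexed)
    p = [0] * (m + 1)     # p[j] = row currently matched to column j (0 = free)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, m + 1):
                if not used[j]:
                    # Reduced cost of edge (i0, j); cost is the negated weight.
                    cur = -weights[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path just found.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sorted((p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0)


# Hypothetical similarity table; a real system would query WordNet or a
# corpus-based measure instead.
TOY_SCORES = {frozenset(("man", "person")): 0.8}


def toy_similarity(a: str, b: str) -> float:
    """Assumed token similarity in [0, 1]: exact match, else table lookup."""
    if a == b:
        return 1.0
    return TOY_SCORES.get(frozenset((a, b)), 0.0)


def sentence_similarity(
    tokens_a: Sequence[str],
    tokens_b: Sequence[str],
    sim: Callable[[str, str], float] = toy_similarity,
) -> float:
    """Matched weight divided by the longer sentence's token count (assumed)."""
    if not tokens_a or not tokens_b:
        return 0.0
    weights = [[sim(a, b) for b in tokens_b] for a in tokens_a]
    pairs = max_weight_matching(weights)
    total = sum(weights[i][j] for i, j in pairs)
    return total / max(len(tokens_a), len(tokens_b))


print(sentence_similarity(
    ["a", "man", "plays", "guitar"],
    ["a", "person", "plays", "a", "guitar"],
))
```

Swapping the tokenizer (bag of words, named-entity aware, or dependency based) only changes the token lists passed to `sentence_similarity`; the matching step itself stays the same.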
