The More the Better? Assessing the Influence of Wikipedia's Growth on Semantic Relatedness Measures

Authors: Iryna Gurevych, Torsten Zesch

Keywords: Computer science, Natural language processing, Artificial intelligence, Semantic similarity

Abstract: Wikipedia has been used as a knowledge source in many areas of natural language processing. As most studies use only a certain Wikipedia snapshot, the influence of Wikipedia's massive growth on the results is largely unknown. For the first time, we perform an in-depth analysis of this influence, using semantic relatedness as an example application that tests a wide range of Wikipedia's properties. We find that Wikipedia's growth has almost no effect on the correlation of semantic relatedness measures with human judgments, while coverage steadily increases.
