SemEval-2012 Task 4: Evaluating Chinese Word Similarity

Authors: Peng Jin, Yunfang Wu

Keywords: Information retrieval, Rank (computer programming), Task (project management), Computer science, Similarity (network science), Value (computer science), Gold standard (test), SemEval, Term (time), Word (computer architecture)

Abstract: This task focuses on evaluating word similarity computation in Chinese. We follow the approach of Finkelstein et al. (2002) to select word pairs. Twenty undergraduates majoring in Chinese linguistics then annotate the data, each assigning a similarity score to every pair. We rank the pairs by the average similarity score across annotators, and this ranking is used as the gold standard. Four participating systems returned results. We evaluate these results against the gold standard in terms of Kendall's tau; three of the systems show a positive correlation with the manually created ranking, although the tau values are very small.
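
The evaluation described in the abstract (average the annotator scores per pair, then compare a system's ordering with the gold ordering using Kendall's tau) can be sketched as follows. This is a minimal illustration, not the task's official scorer: the word pairs, the scores, and the function names `average_scores` and `kendall_tau` are all hypothetical, and the simple tau-a variant used here, which makes no correction for ties, is an assumption.

```python
from itertools import combinations
from statistics import mean


def average_scores(annotations):
    """Map each word pair to the mean of its annotator scores."""
    return {pair: mean(scores) for pair, scores in annotations.items()}


def kendall_tau(gold, system):
    """Kendall's tau-a over the word pairs present in both score dicts.

    Assumption: tied comparisons count as neither concordant nor
    discordant, and the denominator is the total number of comparisons.
    """
    items = sorted(set(gold) & set(system))
    if len(items) < 2:
        raise ValueError("need at least two shared word pairs")
    concordant = discordant = 0
    for a, b in combinations(items, 2):
        sign = (gold[a] - gold[b]) * (system[a] - system[b])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_comparisons = len(items) * (len(items) - 1) // 2
    return (concordant - discordant) / n_comparisons


# Hypothetical data: three annotators scoring four word pairs.
annotations = {
    ("car", "automobile"): [5, 4, 5],
    ("tiger", "cat"): [3, 4, 3],
    ("king", "cabbage"): [1, 0, 1],
    ("noon", "string"): [0, 0, 1],
}
gold = average_scores(annotations)

# Hypothetical similarity scores returned by one system.
system = {
    ("car", "automobile"): 0.9,
    ("tiger", "cat"): 0.4,
    ("king", "cabbage"): 0.5,
    ("noon", "string"): 0.1,
}

tau = kendall_tau(gold, system)
print(round(tau, 3))
```

A tau close to 1 means the system orders the pairs almost exactly as the annotators do, a value near 0 means the two orderings are largely unrelated, and a negative value means the system tends to reverse the gold order.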

References (12)
Georgiana Dinu, Mirella Lapata. Measuring Distributional Similarity in Context. Empirical Methods in Natural Language Processing, pp. 1162-1172 (2010).
Lev Finkelstein, Evgeniy Gabrilovich, Yossi Matias, Ehud Rivlin, Zach Solan, Gadi Wolfman, Eytan Ruppin. Placing Search in Context: The Concept Revisited. ACM Transactions on Information Systems, vol. 20, pp. 116-131 (2002).
Herbert Rubenstein, John B. Goodenough. Contextual Correlates of Synonymy. Communications of the ACM, vol. 8, pp. 627-633 (1965). 10.1145/365628.365657
Jay J. Jiang, David W. Conrath. Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy. Proceedings of the 10th Research on Computational Linguistics International Conference, pp. 19-33 (1997).
Lillian Lee. Measures of Distributional Similarity. Meeting of the Association for Computational Linguistics, pp. 25-32 (1999). 10.3115/1034678.1034693
Evgeniy Gabrilovich, Shaul Markovitch. Computing Semantic Relatedness Using Wikipedia-based Explicit Semantic Analysis. International Joint Conference on Artificial Intelligence, pp. 1606-1611 (2007).
James R. Curran, Marc Moens. Scaling Context Space. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics - ACL '02, pp. 231-238 (2001). 10.3115/1073083.1073123
Mirella Lapata. Automatic Evaluation of Information Ordering: Kendall's Tau. Computational Linguistics, vol. 32, pp. 471-484 (2006). 10.1162/COLI.2006.32.4.471
Alexander Budanitsky, Graeme Hirst. Evaluating WordNet-based Measures of Lexical Semantic Relatedness. Computational Linguistics, vol. 32, pp. 13-47 (2006). 10.1162/COLI.2006.32.1.13
Dekang Lin. Automatic Retrieval and Clustering of Similar Words. Meeting of the Association for Computational Linguistics, pp. 768-774 (1998). 10.3115/980691.980696