Lexical Coverage Evaluation of Large-scale Multilingual Semantic Lexicons for Twelve Languages

Authors: Dawn Knight, Scott Piao, Paul Rayson, Dawn Archer, Francesca Bianchi

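Abstract: The last two decades have seen the development of various semantic lexical resources such as WordNet (Miller, 1995) and the USAS semantic lexicon (Rayson et al., 2004), which have played an important role in the areas of natural language processing and corpus-based studies. Recently, increasing efforts have been devoted to extending the semantic frameworks of existing lexical knowledge resources to cover more languages, such as EuroWordNet and Global WordNet. In this paper, we report on the construction of large-scale multilingual semantic lexicons for twelve languages, which employ the unified Lancaster semantic taxonomy and provide a multilingual lexical knowledge base for the automatic UCREL semantic annotation system (USAS). Our work contributes towards the goal of constructing larger-scale and higher-quality multilingual semantic lexical resources and developing corpus annotation tools based on them. Lexical coverage is an important factor concerning the quality of the lexicons and the performance of the corpus annotation tools, and in this experiment we focus on evaluating the lexical coverage achieved by the multilingual lexicons and the semantic annotation tools based on them. Our evaluation shows that some of the semantic lexicons, such as those for Finnish and Italian, have achieved lexical coverage of over 90%, while others need further expansion.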
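As a rough illustration of the measure discussed in the abstract, lexical coverage can be sketched as the share of corpus tokens that are found in a lexicon. The sketch below is an assumption made for illustration, not the authors' evaluation procedure: the function name `lexical_coverage`, the case-insensitive single-word lookup, and the toy lexicon and token list are all hypothetical.

```python
def lexical_coverage(tokens, lexicon):
    """Return the fraction of tokens whose lowercased form is in the lexicon.

    Assumption: a token counts as covered if its lowercased form is an
    entry in the lexicon. Multiword expressions and lemmatisation are ignored.
    """
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token.lower() in lexicon)
    return known / len(tokens)


# Hypothetical toy data, not taken from the paper.
lexicon = {"the", "cat", "sat", "on", "mat"}
tokens = ["The", "cat", "sat", "on", "the", "sofa"]

coverage = lexical_coverage(tokens, lexicon)
print(f"Lexical coverage: {coverage:.1%}")
```

In this toy case five of the six tokens are in the lexicon, so only "sofa" is left uncovered. An actual evaluation would depend on how each language's text is tokenised and matched against the lexicon.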

References (12)
B. Babych, S. Piao, P. Rayson, O. Mudraya, A. Wilson. Developing a Russian semantic tagger for automatic semantic annotation. (2006)
Nikola Ljubešić, Jörg Tiedemann, Marcos Zampieri. Merging Comparable Data Sources for the Discrimination of Similar Languages: The DSL Corpus Collection. Language Resources and Evaluation, pp. 6-10 (2014)
Andrew Hardie, Amanda Potts, Ghada Mohamed. AraSAS: a semantic tagger for Arabic. (2013)
T McEnery, P Rayson, DE Archer, SL Piao. The UCREL Semantic Analysis System. European Language Resources Association. (2004)
S Piao, P Rayson, DE Archer, AM McEnery. Comparing the UCREL Semantic Annotation Scheme with Lexicographical Taxonomies. Proceedings of the 11th EURALEX International Congress, pp. 817-827 (2004)
S. Piao, P. Rayson, J.P. Juntunen, L. Löfberg, K. Varantola, A. Nykanen. A semantic tagger for the Finnish language. (2005)
Dawn Archer, A. M. McEnery, S. Piao, P. Rayson. Evaluating Lexical Resources for a Semantic Tagger. Language Resources and Evaluation. (2004)
Scott Songlin Piao, Paul Rayson, Dawn Archer, Tony McEnery. Comparing and combining a semantic tagger and a statistical tool for MWE extraction. Computer Speech & Language, vol. 19, pp. 378-397 (2005). 10.1016/J.CSL.2004.11.002
George A. Miller. WordNet. Communications of the ACM, vol. 38, pp. 39-41 (1995). 10.1145/219717.219748
František Čermák, Alexandr Rosen. The case of InterCorp, a multilingual parallel corpus. International Journal of Corpus Linguistics, vol. 17, pp. 411-427 (2012). 10.1075/IJCL.17.3.05CER