Creating an MTT treebank of Spanish

Authors: Leo Wanner, Simon Mille, Alicia Burga, Vanesa Vidal

Abstract: We present a cost-effective strategy for the creation of a mid-size, fine-grained dependency treebank with surface- and deep-syntactic structures, as defined in Meaning-Text Theory, for Spanish. The strategy starts from a small seed dependency corpus, AnCora, whose annotation is considerably more coarse-grained than our target annotation. We show that this discrepancy can be bridged largely by automatic means, relying upon contextual information and thus leaving minimal work to annotators. This allows us to develop resources with limited human effort within a limited period of time. We also propose a preliminary evaluation of the actual amount of work the annotation process requires.
