Towards a Rich Dependency Annotation of Spanish Corpora / Hacia una anotación de dependencias enriquecida de corpus españoles

Authors: Leo Wanner, Simon Mille, Alicia Burga, Vanesa Vidal, Roc Boronat

Keywords: dependency, Meaning-Text, surface syntax, deep syntax, Spanish, treebank

Abstract: We present a cost-effective strategy for the creation of a mid-size, fine-grained Spanish dependency treebank of surface-syntactic, deep-syntactic, and semantic structures as defined in Meaning-Text Theory. The strategy starts from a small seed corpus, AnCora, whose annotation is considerably more coarse-grained than our target annotation. We show that this discrepancy can be bridged largely by automatic means. This allows us to develop the resources with limited human effort within a short period of time. We also propose a preliminary evaluation of the actual amount of work the process requires.
