Efficient large-context dependency parsing and correction with distributional lexical resources

Author: Enrique Henestroza Anguiano

Abstract: This thesis explores ways to improve the accuracy and coverage of efficient statistical dependency parsing. We employ transition-based parsing with models learned using Support Vector Machines (Cortes and Vapnik, 1995), and our experiments are carried out on French. Transition-based parsing is very fast due to the computational efficiency of its underlying algorithms, which are based on a local optimization of attachment decisions. Our first research thread is thus to increase the syntactic context used. From the arc-eager transition system (Nivre, 2008) we propose a variant that simultaneously considers multiple candidate governors for right-directed attachments. We also test parse correction, inspired by Hall and Novák (2005), which revises each attachment in a parse by considering multiple alternative governors in the local syntactic neighborhood. We find that multiple-candidate approaches slightly improve parsing accuracy overall as well as for prepositional phrase attachment and coordination, two linguistic phenomena that exhibit high syntactic ambiguity. Our second research thread explores semi-supervised approaches for improving parsing accuracy and coverage. We test self-training within the journalistic domain as well as for adaptation to the medical domain, using a two-stage parsing approach based on that of McClosky et al. (2006). We then turn to lexical modeling over a large corpus: we model generalized lexical classes to reduce data sparseness, and prepositional phrase attachment preference to improve disambiguation. We find that semi-supervised approaches can sometimes improve parsing accuracy and coverage, without increasing time complexity.
