Author: Enrique Henestroza Anguiano
DOI:
Keywords:
Abstract: This thesis explores ways to improve the accuracy and coverage of efficient statistical dependency parsing. We employ transition-based parsing with models learned using Support Vector Machines (Cortes and Vapnik, 1995), and our experiments are carried out on French. Transition-based parsing is very fast due to the computational efficiency of its underlying algorithms, which are based on a local optimization of attachment decisions. Our first research thread is thus to increase the syntactic context used. From the arc-eager transition system (Nivre, 2008) we propose a variant that simultaneously considers multiple candidate governors for right-directed attachments. We also test parse correction, inspired by Hall and Novák (2005), which revises each attachment in a parse by considering multiple alternative governors in the local syntactic neighborhood. We find that multiple-candidate approaches slightly improve parsing accuracy overall as well as for prepositional phrase attachment and coordination, two linguistic phenomena that exhibit high syntactic ambiguity. Our second research thread explores semi-supervised approaches for improving parsing accuracy and coverage. We test self-training within the journalistic domain as well as for adaptation to the medical domain, using a two-stage parsing approach based on that of McClosky et al. (2006). We then turn to lexical modeling over a large corpus: we model generalized lexical classes to reduce data sparseness, and prepositional phrase attachment preference to improve disambiguation. We find that semi-supervised approaches can sometimes improve parsing accuracy and coverage, without increasing time complexity.
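
For readers unfamiliar with the arc-eager transition system (Nivre, 2008) named in the abstract, the following is a minimal sketch of its four standard transitions. It is not the thesis's multiple-candidate variant, and it is not the thesis's implementation. In place of an SVM classifier, the next transition is chosen by a simple static oracle that reads a gold tree, which is assumed to be projective. The function name `oracle_parse` and the dictionary encoding of heads are illustrative assumptions.

```python
from collections import deque

ROOT = 0  # artificial root node; real tokens are numbered from 1


def oracle_parse(gold_heads):
    """Replay standard arc-eager transitions that rebuild a projective gold tree.

    gold_heads maps each token index (1..n) to the index of its governor
    (0 stands for ROOT). Returns (transitions, heads), where heads holds
    the arcs actually built by the transition sequence.
    """
    stack = [ROOT]
    buffer = deque(sorted(gold_heads))
    heads = {}        # dependent index -> governor index, filled as arcs are added
    transitions = []

    while buffer:
        s0, b0 = stack[-1], buffer[0]
        if s0 != ROOT and s0 not in heads and gold_heads[s0] == b0:
            # LEFT-ARC: the buffer front governs the stack top; pop the stack.
            heads[s0] = b0
            stack.pop()
            transitions.append("LEFT-ARC")
        elif gold_heads[b0] == s0:
            # RIGHT-ARC: the stack top governs the buffer front; push it.
            heads[b0] = s0
            stack.append(buffer.popleft())
            transitions.append("RIGHT-ARC")
        elif s0 in heads and any(
            gold_heads[b0] == k or gold_heads.get(k) == b0 for k in range(s0)
        ):
            # REDUCE: the stack top already has a governor, and the buffer
            # front is linked to a token further to the left.
            stack.pop()
            transitions.append("REDUCE")
        else:
            # SHIFT: no arc is possible yet; move the buffer front to the stack.
            stack.append(buffer.popleft())
            transitions.append("SHIFT")
    return transitions, heads
```

As an illustration, take the made-up four-token sentence "Elle voit Paul aujourd'hui" with gold heads `{1: 2, 2: 0, 3: 2, 4: 2}`. Calling `oracle_parse` on it yields the sequence SHIFT, LEFT-ARC, RIGHT-ARC, RIGHT-ARC, REDUCE, RIGHT-ARC. Each RIGHT-ARC here commits to the stack top as the only candidate governor, which is the local decision that the multiple-candidate variant described in the abstract is meant to widen.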