Discriminative Training of a Neural Network Statistical Parser

Author: James Henderson

DOI: 10.3115/1218955.1218968

Abstract: Discriminative methods have shown significant improvements over traditional generative methods in many machine learning applications, but there has been difficulty in extending them to natural language parsing. One problem is that much of the work on discriminative methods conflates changes to the learning method with changes to the parameterization of the problem. We show how a parser can be trained with a discriminative learning method while still parameterizing the problem according to a generative probability model. We present three methods for training a neural network to estimate the probabilities for a statistical parser: one generative, one discriminative, and one where the probability model is generative but the training criterion is discriminative. The latter model outperforms the previous two, achieving state-of-the-art levels of performance (90.1% F-measure on constituents).
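
The abstract contrasts a generative training criterion with a discriminative one. The sketch below illustrates that contrast on a toy scale. It is not the paper's implementation. It assumes that a model assigns a joint log-probability to each candidate parse of a sentence, and that the sentence probability can be approximated by summing over a small candidate list. The function names and the numbers are hypothetical.

```python
import math

def joint_log_likelihood(joint_logprobs, gold_index):
    """Generative criterion: log P(gold tree, sentence)."""
    return joint_logprobs[gold_index]

def conditional_log_likelihood(joint_logprobs, gold_index):
    """Discriminative criterion: log P(gold tree | sentence).

    Assumption: the sentence probability is approximated by summing the
    joint probabilities of the listed candidate parses only.
    """
    m = max(joint_logprobs)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(lp - m) for lp in joint_logprobs))
    return joint_logprobs[gold_index] - log_z

# Hypothetical joint log-probabilities for three candidate parses;
# index 0 plays the role of the gold-standard tree.
candidates = [math.log(0.02), math.log(0.01), math.log(0.01)]

gen = joint_log_likelihood(candidates, 0)
disc = conditional_log_likelihood(candidates, 0)
print(f"generative objective:     {gen:.4f}")
print(f"discriminative objective: {disc:.4f}")
```

Both functions score the same joint probability model. Only the objective differs, which corresponds to the third setting named in the abstract: a generative parameterization paired with a discriminative training criterion.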

References (17)
Eugene Charniak. A maximum-entropy-inspired parser. North American Chapter of the Association for Computational Linguistics, pp. 132-139 (2000).
Christopher M. Bishop. Neural Networks for Pattern Recognition (1995).
Mitch Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, vol. 19, pp. 313-330 (1993). DOI: 10.21236/ADA273556
Adwait Ratnaparkhi. A Maximum Entropy Model for Part-Of-Speech Tagging. Empirical Methods in Natural Language Processing (1996).
Adwait Ratnaparkhi. Learning to Parse Natural Language with Maximum Entropy Models. Machine Learning, vol. 34, pp. 151-175 (1999). DOI: 10.1023/A:1007502103375
Michael Collins. Head-Driven Statistical Models for Natural Language Parsing. Computational Linguistics, vol. 29, pp. 589-637 (2003). DOI: 10.1162/089120103322753356
Dan Klein, Christopher D. Manning. Conditional structure versus conditional estimation in NLP models. Empirical Methods in Natural Language Processing, pp. 9-16 (2002). DOI: 10.3115/1118693.1118695
Michael Collins, Nigel Duffy. New ranking algorithms for parsing and tagging. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics - ACL '02, pp. 263-270 (2001). DOI: 10.3115/1073083.1073128
James Henderson. Inducing history representations for broad coverage statistical parsing. Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - NAACL '03, pp. 24-31 (2003). DOI: 10.3115/1073445.1073459
Rens Bod. An efficient implementation of a new DOP model. Conference of the European Chapter of the Association for Computational Linguistics, pp. 19-26 (2003). DOI: 10.3115/1067807.1067812