Learning Accurate, Compact, and Interpretable Tree Annotation

Authors: Slav Petrov, Leon Barrett, Romain Thibaux, Dan Klein

DOI: 10.3115/1220175.1220230

Abstract: We present an automatic approach to tree annotation in which basic nonterminal symbols are alternately split and merged to maximize the likelihood of a training treebank. Starting with a simple X-bar grammar, we learn a new grammar whose nonterminals are subsymbols of the original nonterminals. In contrast with previous work, we are able to split various terminals to different degrees, as appropriate to the actual complexity in the data. Our grammars automatically learn the kinds of linguistic distinctions exhibited in previous work on manual tree annotation. On the other hand, our grammars are much more compact and substantially more accurate than previous work on automatic annotation. Despite its simplicity, our best grammar achieves an F1 of 90.2% on the Penn Treebank, higher than fully lexicalized systems.
