Learning and inference for hierarchically split PCFGs

Authors: Slav Petrov, Dan Klein

DOI:

Keywords: Computer science, Treebank, Context-free grammar, Grammar, Top-down parsing, Parsing, S-attributed grammar, Rule-based machine translation, Parser combinator, Generative grammar, Natural language processing, Artificial intelligence, Parsing expression grammar, L-attributed grammar

Abstract: Treebank parsing can be seen as the search for an optimally refined grammar consistent with a coarse training treebank. We describe a method in which a minimal grammar is hierarchically refined using EM to give accurate, compact grammars. The resulting grammars are extremely compact compared to other high-performance parsers, yet the parser gives the best published accuracies on several languages, as well as the best generative parsing numbers in English. In addition, we give an associated coarse-to-fine inference scheme which vastly improves inference time with no loss in test set accuracy.
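
The abstract's idea of hierarchically refining a minimal grammar can be illustrated with a small sketch of a single split step. This is not the authors' implementation: the function names, the dictionary representation of rules, the `noise` parameter, and the toy grammar are all assumptions made for illustration. Each nonterminal other than the root is split into two subsymbols, each rule's probability is shared across the resulting subsymbol combinations with a little random noise so the copies are not identical, and the rules are renormalized per left-hand side. The EM re-estimation that would follow a split is not shown.

```python
import itertools
import random
from collections import defaultdict


def variants(symbol, nonterminals, root):
    """Return the two subsymbols of a splittable nonterminal, else the symbol itself."""
    if symbol in nonterminals and symbol != root:
        return [f"{symbol}_0", f"{symbol}_1"]
    return [symbol]


def split_grammar(rules, nonterminals, root="ROOT", noise=0.01, seed=0):
    """One split step over a PCFG given as {(lhs, rhs_tuple): probability}.

    Hypothetical representation: terminals are any symbols not in `nonterminals`.
    """
    rng = random.Random(seed)
    new_rules = {}
    for (lhs, rhs), prob in rules.items():
        # Every combination of subsymbols on the right-hand side.
        rhs_options = list(
            itertools.product(*(variants(s, nonterminals, root) for s in rhs))
        )
        for new_lhs in variants(lhs, nonterminals, root):
            for new_rhs in rhs_options:
                # Share the mass evenly, perturbed slightly to break symmetry.
                jitter = 1.0 + noise * rng.uniform(-1.0, 1.0)
                new_rules[(new_lhs, new_rhs)] = prob / len(rhs_options) * jitter

    # Renormalize so the rules of each left-hand-side subsymbol sum to 1.
    totals = defaultdict(float)
    for (lhs, _), p in new_rules.items():
        totals[lhs] += p
    return {rule: p / totals[rule[0]] for rule, p in new_rules.items()}


# Toy grammar, invented for this example only.
toy_nonterminals = {"ROOT", "S", "NP", "VP"}
toy_rules = {
    ("ROOT", ("S",)): 1.0,
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("she",)): 1.0,
    ("VP", ("sleeps",)): 1.0,
}
split_rules = split_grammar(toy_rules, toy_nonterminals)
```

Applying `split_grammar` repeatedly to its own output would not work as written, because the subsymbol names are not tracked as nonterminals; a fuller version would also return the updated nonterminal set.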

References (10)
Slav Petrov, Dan Klein, Improved Inference for Unlexicalized Parsing. North American Chapter of the Association for Computational Linguistics, pp. 404-411 (2007)
Matthew Lease, Eugene Charniak, Mark Johnson, David McClosky, A look at parsing and its applications. National Conference on Artificial Intelligence, pp. 1642-1645 (2006)
Zhiyi Chi, Statistical properties of probabilistic context-free grammars. Computational Linguistics, vol. 25, pp. 131-160 (1999)
Anna Corazza, Giorgio Satta, Cross-Entropy and Estimation of Probabilistic Context-Free Grammars. Language and Technology Conference, pp. 335-342 (2006), doi:10.3115/1220835.1220878
Michael Collins, Head-Driven Statistical Models for Natural Language Parsing. Computational Linguistics, vol. 29, pp. 589-637 (2003), doi:10.1162/089120103322753356
Dan Klein, Christopher D. Manning, Accurate unlexicalized parsing. Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - ACL '03, pp. 423-430 (2003), doi:10.3115/1075096.1075150
Eugene Charniak, Mark Johnson, Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking. Meeting of the Association for Computational Linguistics, pp. 173-180 (2005), doi:10.3115/1219840.1219862
Slav Petrov, Leon Barrett, Romain Thibaux, Dan Klein, Learning Accurate, Compact, and Interpretable Tree Annotation. Meeting of the Association for Computational Linguistics, pp. 433-440 (2006), doi:10.3115/1220175.1220230
Takuya Matsuzaki, Yusuke Miyao, Jun'ichi Tsujii, Probabilistic CFG with Latent Annotations. Meeting of the Association for Computational Linguistics, pp. 75-82 (2005), doi:10.3115/1219840.1219850
Fernando Pereira, Yves Schabes, Inside-outside reestimation from partially bracketed corpora. Proceedings of the 30th Annual Meeting on Association for Computational Linguistics, pp. 128-135 (1992), doi:10.3115/981967.981984