Inducing Head-Driven PCFGs with Latent Heads: Refining a Tree-Bank Grammar for Parsing

Author: Detlef Prescher

DOI: 10.1007/11564096_30

Abstract: Although state-of-the-art parsers for natural language are lexicalized, it was recently shown that an accurate unlexicalized parser for the Penn tree-bank can be simply read off a manually refined tree-bank. While lexicalized parsers often suffer from sparse data, manual mark-up is costly and largely based on individual linguistic intuition. Thus, across domains, languages, and annotations, a fundamental question arises: Is it possible to automatically induce an accurate parser from a tree-bank without resorting to full lexicalization? In this paper, we show how to induce a probabilistic parser with latent head information from simple principles. Our parser has a performance of 85.1% (LP/LR F1), which is as good as that of early lexicalized ones. This is remarkable since the induction of grammars is in general a hard task.
