Latent Tree Language Model

Author: Tomáš Brychcín

DOI: 10.18653/V1/D16-1042

Keywords: Incremental decision tree, Tree structure, Binary tree, Computer science, Natural language processing, Interval tree, Tree traversal, Fractal tree index, Trie, Tree (data structure), Artificial intelligence

Abstract: In this paper we introduce the Latent Tree Language Model (LTLM), a novel approach to language modeling that encodes the syntax and semantics of a given sentence as a tree of word roles. The learning phase iteratively updates the trees by moving nodes according to Gibbs sampling. We introduce two algorithms to infer a tree for a given sentence. The first one is based on Gibbs sampling: it is fast, but does not guarantee finding the most probable tree. The second one is based on dynamic programming: it is slower, but guarantees finding the most probable tree. We provide a comparison of both algorithms. We combine LTLM with a 4-gram Modified Kneser-Ney model via linear interpolation. Our experiments on English and Czech corpora show significant perplexity reductions (up to 46% for English and 49% for Czech) compared with the standalone 4-gram Modified Kneser-Ney model.
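The abstract combines LTLM with a 4-gram Modified Kneser-Ney model via linear interpolation and evaluates by perplexity. Below is a minimal sketch of that combination, not the paper's implementation: `p_ltlm` and `p_kn` are hypothetical callables returning P(word | history) under each model, and `lam` is an assumed interpolation weight that would normally be tuned on held-out data.

```python
import math

def interpolated_prob(word, history, p_ltlm, p_kn, lam=0.5):
    """Linear interpolation: P(w|h) = lam * P_LTLM(w|h) + (1 - lam) * P_KN(w|h)."""
    return lam * p_ltlm(word, history) + (1.0 - lam) * p_kn(word, history)

def perplexity(test_words, p_ltlm, p_kn, lam=0.5):
    """Perplexity of the interpolated model over a list of test words."""
    log_prob = 0.0
    for i, word in enumerate(test_words):
        history = tuple(test_words[max(0, i - 3):i])  # 4-gram-style history
        log_prob += math.log2(interpolated_prob(word, history, p_ltlm, p_kn, lam))
    return 2.0 ** (-log_prob / len(test_words))
```

The reported perplexity reductions would correspond to comparing `perplexity` of the interpolated model against that of the standalone Kneser-Ney model on the same test data.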

References (22)
Tomas Mikolov, Martin Karafiát, Sanjeev Khudanpur, Jan Cernocký, Lukás Burget, Recurrent neural network based language model, Conference of the International Speech Communication Association, pp. 1045-1048, (2010)
Thomas P. Minka, Estimating a Dirichlet Distribution, (2000)
Martin Popel, David Mareček, Perplexity of n-Gram and Dependency Language Models, Text, Speech and Dialogue, pp. 173-180, (2010), 10.1007/978-3-642-15760-8_23
Hiyan Alshawi, Valentin I. Spitkovsky, Daniel Jurafsky, Christopher D. Manning, Viterbi Training Improves Unsupervised Dependency Parsing, Conference on Computational Natural Language Learning, pp. 9-17, (2010)
Koen Deschacht, Jan De Belder, Marie-Francine Moens, The latent words language model, Computer Speech & Language, vol. 26, pp. 384-409, (2012), 10.1016/J.CSL.2012.04.001
Tomáš Brychcín, Miloslav Konopík, Latent semantics in language models, Computer Speech & Language, vol. 33, pp. 88-108, (2015), 10.1016/J.CSL.2015.01.004
A. P. Dempster, N. M. Laird, D. B. Rubin, Maximum Likelihood from Incomplete Data via the EM Algorithm, Journal of the Royal Statistical Society: Series B (Methodological), vol. 39, pp. 1-22, (1977), 10.1111/J.2517-6161.1977.TB01600.X
E. W. D. Whittaker, P. C. Woodland, Language modelling for Russian and English using words and classes, Computer Speech & Language, vol. 17, pp. 87-104, (2003), 10.1016/S0885-2308(02)00047-5
Tomáš Brychcín, Miloslav Konopík, Semantic spaces for improving language modeling, Computer Speech & Language, vol. 28, pp. 192-209, (2014), 10.1016/J.CSL.2013.05.001
Sven Martin, Jörg Liermann, Hermann Ney, Algorithms for bigram and trigram word clustering, Speech Communication, vol. 24, pp. 19-37, (1998), 10.1016/S0167-6393(97)00062-9