Author: Tomáš Brychcín
DOI: 10.18653/V1/D16-1042
Keywords: Incremental decision tree, Tree structure, Binary tree, Computer science, Natural language processing, Interval tree, Tree traversal, Fractal tree index, Trie, Tree (data structure), Artificial intelligence
Abstract: In this paper we introduce the Latent Tree Language Model (LTLM), a novel approach to language modeling that encodes the syntax and semantics of a given sentence as a tree of word roles. The learning phase iteratively updates the trees by moving nodes according to Gibbs sampling. We introduce two algorithms to infer a tree for a given sentence. The first one is based on Gibbs sampling. It is fast, but does not guarantee finding the most probable tree. The second one is based on dynamic programming. It is slower, but guarantees finding the most probable tree. We provide a comparison of both algorithms. We combine LTLM with a 4-gram Modified Kneser-Ney language model via linear interpolation. Our experiments with English and Czech corpora show significant perplexity reductions (up to 46% for English and 49% for Czech) compared with the standalone 4-gram Modified Kneser-Ney language model.
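The abstract mentions combining LTLM with a 4-gram Modified Kneser-Ney model via linear interpolation. A minimal sketch of that standard combination scheme, with made-up probabilities and an illustrative weight `lam` (the paper's actual interpolation weight and model scores are not given here):

```python
def interpolate(p_ltlm: float, p_kn: float, lam: float) -> float:
    """Linearly interpolate two language-model probabilities.

    p_ltlm -- probability assigned by the tree-based model (illustrative)
    p_kn   -- probability assigned by the Kneser-Ney n-gram model (illustrative)
    lam    -- interpolation weight in [0, 1], tuned on held-out data
    """
    return lam * p_ltlm + (1.0 - lam) * p_kn

# Example with hypothetical per-word probabilities:
p = interpolate(p_ltlm=0.02, p_kn=0.01, lam=0.5)
```

With `lam = 0.5` the combined probability is the plain average of the two models; in practice the weight would be optimized on development data.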