N-Gram Model Smoothing with Independently Controllable Parameters

Author: Robert Carter Moore

DOI:

Keywords:

Abstract: Described is a technology by which a probability is estimated for a token in a sequence of tokens, based upon the number of zero or more times (actual counts) that the sequence was observed in training data. The token may be a word in a word sequence, and the probability may be used in a statistical language model. A discount parameter is set independently of the interpolation parameters. If the sequence was observed at least once in the training data, a discounted probability and an interpolation probability are computed and summed to provide the probability; if the sequence was not observed, the probability is estimated by computing a backoff probability. Also described are various ways to obtain the discount and interpolation parameters.
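As a rough illustration of the scheme the abstract describes, here is a minimal bigram sketch in Python. The toy corpus and all names (estimate_prob, DISCOUNT, INTERP_WEIGHT, backoff_weight) are illustrative assumptions, not taken from the patent, which covers general n-gram orders and various ways of setting the parameters.

```python
from collections import Counter

# Toy corpus; a real model would be trained on far more data.
tokens = "the cat sat on the mat the cat ran".split()

bigram_counts = Counter(zip(tokens, tokens[1:]))
context_counts = Counter(tokens[:-1])
unigram_counts = Counter(tokens)
total_tokens = sum(unigram_counts.values())

DISCOUNT = 0.5       # discount parameter, chosen on its own
INTERP_WEIGHT = 0.3  # interpolation parameter, set independently of DISCOUNT

def unigram_prob(w):
    # Lower-order (unigram) estimate used for interpolation and backoff.
    return unigram_counts[w] / total_tokens

def backoff_weight(h):
    # Scale factor for unobserved continuations, chosen so that the
    # probabilities over all words sum to one after the observed bigrams
    # have claimed their discounted-plus-interpolated mass.
    observed = [w for (ctx, w) in bigram_counts if ctx == h]
    claimed = sum(
        (bigram_counts[(h, w)] - DISCOUNT) / context_counts[h]
        + INTERP_WEIGHT * unigram_prob(w)
        for w in observed
    )
    unclaimed_lower = 1.0 - sum(unigram_prob(w) for w in observed)
    return (1.0 - claimed) / unclaimed_lower

def estimate_prob(h, w):
    if bigram_counts[(h, w)] > 0:
        # Observed at least once: discounted relative frequency plus an
        # independently weighted lower-order estimate, summed together.
        discounted = (bigram_counts[(h, w)] - DISCOUNT) / context_counts[h]
        return discounted + INTERP_WEIGHT * unigram_prob(w)
    # Never observed: back off to the scaled lower-order estimate.
    return backoff_weight(h) * unigram_prob(w)

print(estimate_prob("the", "cat"))  # observed bigram
print(estimate_prob("the", "ran"))  # unobserved bigram
```

Because the discount and the interpolation weight are decoupled, each can be tuned separately (for example on held-out data); in classical absolute discounting, by contrast, the interpolation weight is a fixed function of the discount. That independence is what the title refers to.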
