Author: Sehyeong Cho
DOI: 10.1007/978-3-540-30497-5_63
Keywords:
Abstract: This paper proposes a method of combining two n-gram language models to construct a single model. One of the models is constructed from a very small corpus of the right domain of interest, and the other from a large but less adequate corpus. The method is based on the observation that the former has high-quality n-grams but suffers from a sparseness problem, while the latter is inadequately biased yet easy to obtain in a bigger size. The basic idea behind dual-source backoff is basically the same as Katz's backoff. We ran experiments with 3-gram models built from newspaper corpora of several millions to tens of millions of words, together with smaller broadcast news corpora; the target domain was broadcast news. We obtained a significant improvement by incorporating a corpus around one thirtieth the size.
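The abstract describes the mechanism only at a high level. Below is a minimal sketch of the dual-source backoff chain it suggests, assuming Python with plain count dictionaries; absolute discounting stands in for the Good-Turing discounts and normalizing alphas of true Katz backoff, so the probabilities are illustrative rather than properly normalized. The function names and the `discount` parameter are hypothetical, not from the paper.

```python
from collections import defaultdict

def count_ngrams(sentences, n=3):
    """Count every k-gram (k = 1..n) in a tokenized corpus; the
    empty tuple holds the total token count so the unigram case
    uses the same lookup as higher orders."""
    counts = defaultdict(int)
    for sent in sentences:
        tokens = ["<s>"] * (n - 1) + sent + ["</s>"]
        for k in range(1, n + 1):
            for i in range(len(tokens) - k + 1):
                counts[tuple(tokens[i:i + k])] += 1
        counts[()] += len(tokens)
    return counts

def dual_source_prob(w, history, in_dom, out_dom, discount=0.5):
    """P(w | history): try the small in-domain counts first, then
    the large out-of-domain counts at the same order, and only
    then recurse on a shortened history.  (Hypothetical sketch:
    absolute discounting replaces Katz's Good-Turing discounts
    and backoff weights, so this is not properly normalized.)"""
    gram = history + (w,)
    for counts in (in_dom, out_dom):
        if counts.get(gram, 0) > 0:
            return (counts[gram] - discount) / counts[history]
    if not history:
        return 1e-7  # tiny floor for words unseen in both corpora
    return dual_source_prob(w, history[1:], in_dom, out_dom, discount)
```

A call such as `dual_source_prob("reports", ("news", "anchor"), in_dom, out_dom)` first checks the in-domain trigram, then the out-of-domain trigram, and only then shortens the history, mirroring the preference the abstract describes: high-quality in-domain n-grams first, the bigger but biased corpus as a fallback.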