Stem-based maximum entropy language models for inflectional languages.

Authors: Vassilios Digalakis, Dimitris Oikonomidis

Abstract: In this work we build language models using three different training methods: n-gram, class-based and maximum entropy models. The main issue is the use of stem information to cope with the very large number of distinct words of an inflectional language, like Greek. We compare the models on both perplexity and word error rate. We also examine thoroughly the differences on specific subsets of words.

References (9)

Vassilios Digalakis, Vassilios Diakoloukas, Nikos Tsourakis, Dimitris Pratsolis, Dimitris Oikonomidis, Christos Vosnidis, Nikos Chatzichrisafis, Large vocabulary continuous speech recognition in Greek: corpus and an automatic dictation system. Conference of the International Speech Communication Association (2003)

Jun Wu, Sanjeev Khudanpur, Building a topic-dependent maximum entropy model for very large corpora. IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 1, pp. 777-780 (2002), 10.1109/ICASSP.2002.5743833

Hermann Ney, Ute Essen, Reinhard Kneser, On structuring probabilistic dependences in stochastic language modelling. Computer Speech & Language, vol. 8, pp. 1-38 (1994), 10.1006/CSLA.1994.1001

I. J. Good, The population frequencies of species and the estimation of population parameters. Biometrika, vol. 40, pp. 237-264 (1953), 10.1093/BIOMET/40.3-4.237

Vincent J. Della Pietra, Adam L. Berger, Stephen A. Della Pietra, A maximum entropy approach to natural language processing. Computational Linguistics, vol. 22, pp. 39-71 (1996), 10.5555/234285.234289

I.H. Witten, T.C. Bell, The zero-frequency problem: estimating the probabilities of novel events in adaptive text compression. IEEE Transactions on Information Theory, vol. 37, pp. 1085-1094 (1991), 10.1109/18.87000

D.J. Kershaw, L. Lamel, D.A. Leeuwen, D. Pye, A.J. Robinson, H.J.M. Steeneken, P.C. Woodland, S.J. Young, M. Adda-Dekker, X. Aubert, C. Dugast, J.L. Gauvain, Multilingual large vocabulary speech recognition: the European SQALE project. Computer Speech & Language, vol. 11, pp. 73-89 (1997), 10.1006/CSLA.1996.0023

S. Katz, Estimation of probabilities from sparse data for the language model component of a speech recognizer. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 35, pp. 400-401 (1987), 10.1109/TASSP.1987.1165125

Joshua T. Goodman, Stanley F. Chen, An Empirical Study of Smoothing Techniques for Language Modeling. arXiv: Computation and Language (1996)

Similar articles (6)

Large vocabulary continuous speech recognition in Greek: corpus and an automatic dictation system.

Advances in Large Vocabulary Continuous Speech Recognition in Greek: Modeling and nonlinear features

Morphologically Motivated Language Models in Speech Recognition

Development of a Modern Greek Broadcast-News Corpus and Speech Recognition System

Latent semantics in language models

Feature-rich sub-lexical language models using a maximum entropy approach for German LVCSR
