Text normalization and speech recognition in French.

Authors: Martine Adda-Decker, Gilles Adda, Lori Lamel, Jean-Luc Gauvain

Abstract: In this paper we present a quantitative investigation into the impact of text normalization on lexica and language models for speech recognition in French. The text normalization process defines what is considered to be a word by the recognition system. Depending on this definition we can measure different lexical coverages and language model perplexities, both of which are closely related to the speech recognition accuracies obtained on read newspaper texts. Different text normalizations of up to 185M words of newspaper texts are presented along with corresponding lexical coverage and perplexity measures. Some normalizations were found to be necessary to achieve good lexical coverage, while others were more or less equivalent in this regard. The choice of normalization to create language models for use in the recognition experiments was based on these findings. Our best system configuration obtained an 11.2% word error rate in the AUPELF ‘French-speaking’ speech recognizer evaluation test held in February 1997.
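
As a rough illustration of how the definition of a word changes lexical coverage, the following Python sketch compares coverage of a test text under two tokenizations. The function names, the two normalization rules (lowercasing and splitting after an apostrophe), and the toy sentences are assumptions made for illustration; they are not the normalizations or the data described in the paper.

```python
import re
from collections import Counter


def tokenize(text, lowercase=False, split_apostrophes=False):
    """Hypothetical normalization: whitespace tokens with two optional rules."""
    if lowercase:
        text = text.lower()
    if split_apostrophes:
        # "l'ami" -> "l' ami": the elided form becomes its own token.
        # Deliberately naive: it would also split fixed forms such as "aujourd'hui".
        text = re.sub(r"(\w')(\w)", r"\1 \2", text)
    # Replace punctuation (except apostrophes and hyphens) with spaces.
    text = re.sub(r"[^\w'\- ]", " ", text)
    return text.split()


def lexical_coverage(train_tokens, test_tokens, vocab_size):
    """Fraction of test tokens found among the vocab_size most frequent training tokens."""
    vocab = {w for w, _ in Counter(train_tokens).most_common(vocab_size)}
    covered = sum(1 for w in test_tokens if w in vocab)
    return covered / len(test_tokens)


# Toy data, invented for this example.
train_text = "Le chat dort. L'ami du chat dort aussi."
test_text = "le chat de l'ami dort"

cov_raw = lexical_coverage(tokenize(train_text), tokenize(test_text), vocab_size=10)
cov_norm = lexical_coverage(
    tokenize(train_text, lowercase=True, split_apostrophes=True),
    tokenize(test_text, lowercase=True, split_apostrophes=True),
    vocab_size=10,
)
print(f"coverage without normalization: {cov_raw:.3f}")
print(f"coverage with normalization:    {cov_norm:.3f}")
```

On this toy input the normalized tokenization covers more of the test text, because case variants and elided forms collapse into tokens already seen in training. The complementary quantity, one minus the coverage, is the out-of-vocabulary rate.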
