Speaker verification over the telephone

Authors: L.F. Lamel, J.L. Gauvain

DOI: 10.1016/S0167-6393(99)00075-8

Keywords: Speaker recognition, Context (language use), Test data, Telephone network, Mixture model, Computer science, Speech processing, Word error rate, Speech recognition, Hidden Markov model

Abstract: Speaker verification has been the subject of active research for many years, yet despite these efforts and promising results on laboratory data, speaker verification performance over the telephone remains below that required for many applications. This experimental study aimed to quantify speaker recognition performance out of the context of any specific application, as a function of factors more-or-less acknowledged to affect the accuracy. Some of the issues addressed are: the speaker model (Gaussian mixture models are compared with phone-based models); the influence of the amount and content of training and test data on performance; performance degradation due to model aging and how this can be counteracted by using adaptation techniques; and achievable performance levels using text-dependent and text-independent recognition modes. These and other factors were addressed using a large corpus of read and spontaneous speech (over 250 hours collected from 100 target speakers and 1000 imposters) in French, designed and recorded for the purpose of this study. On these data, the lowest equal error rate is 1% for the text-dependent mode when two trials are allowed per attempt, with a minimum of 1.5 s of speech per trial. © 2000 Elsevier Science B.V. All rights reserved.

References (25)
Lori Lamel, Jean-Luc Gauvain, Identifying non-linguistic speech features. Conference of the International Speech Communication Association (1993).
B. Prouts, Lori Lamel, Jean-Luc Gauvain, Experiments with speaker verification over the telephone. Conference of the International Speech Communication Association (1995).
Maxine Eskénazi, Jean-Luc Gauvain, Lori F. Lamel, BREF, a large vocabulary spoken corpus for French. Conference of the International Speech Communication Association (1991).
Sadaoki Furui, An Overview of Speaker Recognition Technology. Springer, Boston, MA, pp. 31-56 (1996). DOI: 10.1007/978-1-4613-1367-0_2
M. Newman, L. Gillick, Y. Ito, D. McAllaster, B. Peskin, Speaker verification through large vocabulary continuous speech recognition. International Conference on Spoken Language Processing, vol. 4, pp. 2419-2422 (1996). DOI: 10.1109/ICSLP.1996.607297
L. Boves, E. den Os, Speaker recognition in telecom applications. Proceedings 1998 IEEE 4th Workshop Interactive Voice Technology for Telecommunications Applications. IVTTA '98 (Cat. No.98TH8376), pp. 203-208 (1998). DOI: 10.1109/IVTTA.1998.727721
John J. Godfrey, Multilingual speech databases at LDC. Proceedings of the workshop on Human Language Technology - HLT '94, pp. 23-26 (1994). DOI: 10.3115/1075812.1075819
H. Gish, M. Schmidt, Text-independent speaker identification. IEEE Signal Processing Magazine, vol. 11, pp. 18-32 (1994). DOI: 10.1109/79.317924
G.R. Doddington, Speaker recognition—Identifying people by their voices. Proceedings of the IEEE, vol. 73, pp. 1651-1664 (1985). DOI: 10.1109/PROC.1985.13345
Lori F. Lamel, Jean-Luc Gauvain, A phone-based approach to non-linguistic speech feature identification. Computer Speech & Language, vol. 9, pp. 87-103 (1995). DOI: 10.1006/CSLA.1995.0005