Integration of acoustic information and PPRLM scores in a multiple-Gaussian classifier for Language Identification

Authors: R. Cordoba, R. San-Segundo, J. Macias, Juan Montero, R. Barra

DOI: 10.1109/ODYSSEY.2006.248105

Abstract: In this paper, we present several innovative techniques that can be applied in a PPRLM system for language identification (LID). We show how we obtained a 53.5% relative error reduction from our base system using these techniques. First, the application of a variable threshold in score computation, dependent on the average scores in the language model, provided a 35% error reduction. A random selection of sentences for the different sets and the use of silence models also improved the system. Then, to improve the classifier, we compared the bias removal technique (up to 19% error reduction) and a Gaussian classifier (up to 37% error reduction). Finally, we included the acoustic score in the Gaussian classifier (2% error reduction) and increased the number of Gaussians to obtain a multiple-Gaussian classifier (14% error reduction). All these improvements are remarkable, as they have been mostly additive.

References (10)
Sridha Sridharan, Eddie Wong. Methods to improve Gaussian mixture model based language identification system. Conference of the International Speech Communication Association (2002).
Qin Jin, Alex Waibel, Tanja Schultz. Phonetic speaker identification. Conference of the International Speech Communication Association (2002).
Javier Macías Guarasa, Javier Ferreiros, Juan Manuel Montero, Ricardo de Córdoba, G. Prime, José Manuel Pardo. PPRLM optimization for language identification in air traffic control tasks. Conference of the International Speech Communication Association (2003).
Cuntai Guan, Bin Ma, Haizhou Li, Chin-Hui Lee. Multilingual speech recognition with language identification. Conference of the International Speech Communication Association (2002).
Javier Macías Guarasa, Luis Fernando D'Haro, Javier Ferreiros, Fernando Fernández-Martínez, Valentín Sama, Ricardo de Córdoba. Language identification techniques based on full recognition in an air traffic control task. Conference of the International Speech Communication Association (2004).
Pedro A. Torres-Carrasquillo, Douglas A. Reynolds, J.R. Deller. Language identification using Gaussian mixture model tokenization. IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 1, pp. 757-760 (2002). DOI: 10.1109/ICASSP.2002.5743828
Jean-Luc Gauvain, Abdelkhalek Messaoudi, Holger Schwenk. Language recognition using phone lattices. Conference of the International Speech Communication Association (2004).
J. Navratil. Spoken language recognition-a step toward multilinguality in speech processing. IEEE Transactions on Speech and Audio Processing, vol. 9, pp. 678-685 (2001). DOI: 10.1109/89.943345
Pedro A. Torres-Carrasquillo, Douglas A. Reynolds, Mary A. Kohler, Elliot Singer, Richard J. Greene, John R. Deller. Approaches to language identification using Gaussian mixture models and shifted delta cepstral features. Conference of the International Speech Communication Association (2002).
M.A. Zissman. Comparison of four approaches to automatic language identification of telephone speech. IEEE Transactions on Speech and Audio Processing, vol. 4, pp. 31-44 (1996). DOI: 10.1109/TSA.1996.481450