The BLZ Submission to the NIST 2011 LRE: Data Collection, System Development and Performance

Authors: Mikel Penagarikano, Eduardo Lleida, Jesus Villalba, Alberto Abad, Mireia Diez

Abstract: This paper describes the most relevant features of a collaborative multi-site submission to the NIST 2011 Language Recognition Evaluation (LRE), consisting of one primary and three contrastive systems, each fusing different combinations of 13 state-of-the-art (acoustic and phonotactic) language recognition subsystems. The collaboration focused on collecting and sharing training data for those target languages for which few development data were provided by NIST, and on defining a common dataset to train backend and fusion parameters and to select the best fusions. Official and post-key results are presented and compared, revealing that the greedy approach applied to select the best fusions provided suboptimal but very competitive performance. Several factors contributed to the high performance attained by the BLZ systems, including the availability of training data for low resource languages, the reliability of the development dataset (consisting only of data audited by NIST), the diversity of modeling approaches and datasets in the systems considered for fusion, and the effectiveness of the search for optimal fusions.

