Evaluation of spoken language recognition technology using broadcast speech: performance and challenges.

Authors: Mikel Peñagarikano, Luis Javier Rodríguez-Fuentes, Germán Bordel, Mireia Díez, Amparo Varona

Abstract: Spoken Language Recognition (SLR) technology has improved remarkably in recent years, partly thanks to the NIST Language Recognition Evaluations (LRE), which have become standard benchmarks for testing new approaches. These evaluations focus on narrow-band conversational telephone speech and deal with some specific target languages. Recent efforts to expand the scope of SLR assessment include the Albayzin 2008 and 2010 LRE, which use wide-band TV broadcast speech. In this work, a system based on state-of-the-art approaches is developed and evaluated on these LRE datasets, looking to identify those conditions that make the task challenging and eventually to guide the design of future evaluations using the same kind of data. We present and analyse performance under different conditions, regarding: (1) the set of target languages (including details about their confusion with each other); (2) the amount of data available to estimate models; and (3) the presence of background noise.
