ELISA Nist RT03 Broadcast News Speaker Diarization Experiments

Authors: Daniel Moraru, Sylvain Meignier, Corinne Fredouille, Laurent Besacier, Jean-François Bonastre

Abstract: This paper presents the ELISA consortium's work on automatic speaker diarization (also known as speaker segmentation) during the NIST Rich Transcription (RT) 2003 evaluation. The experiments were carried out on real broadcast news data (HUB4) within the framework of the consortium. The paper first shows the value of segmentation into acoustic macro-classes (such as gender or bandwidth) as front-end processing for the segmentation/diarization task. The impact of this prior segmentation is evaluated in terms of diarization performance. Second, two different approaches, from the CLIPS and LIA laboratories, are presented, and possibilities for combining them are investigated. The system submitted as primary obtained the second-lowest error rate compared to the other RT03 participant systems. Another system, submitted as secondary, outperformed the best primary system (i.e. it obtained the lowest error rate).
