Method and system for speaker diarization

Author: Hagai Aronowitz

Abstract: A method and system for speaker diarization are provided. Pre-trained acoustic models of individual speakers and/or groups of speakers are obtained. Speech data with multiple speakers is received and divided into frames. For a frame, an acoustic feature vector is determined and extended to include log-likelihood ratios of the pre-trained models in relation to a background population model. The extended acoustic feature vector is used in segmentation and clustering algorithms.
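The feature extension described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the pre-trained speaker models and the background population model are diagonal-covariance Gaussian mixture models; the abstract does not name a model type. The function names `gmm_frame_loglik` and `extend_features` and the `(weights, means, variances)` tuple layout are hypothetical, not taken from the patent.

```python
import numpy as np


def gmm_frame_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM (assumed model type).

    frames: (T, D) acoustic feature vectors, one row per frame
    weights: (K,) mixture weights; means, variances: (K, D)
    Returns an array of shape (T,).
    """
    diff = frames[:, None, :] - means[None, :, :]            # (T, K, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)                # Mahalanobis term, (T, K)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)    # normalisation term, (K,)
    )
    log_weighted = log_comp + np.log(weights)                # (T, K)
    # Log-sum-exp over mixture components, stabilised by subtracting the max.
    peak = log_weighted.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(log_weighted - peak).sum(axis=1, keepdims=True)))[:, 0]


def extend_features(frames, speaker_models, background_model):
    """Append one log-likelihood ratio per pre-trained model to every frame.

    Each model is a hypothetical (weights, means, variances) tuple.
    Returns an array of shape (T, D + number of speaker models).
    """
    background = gmm_frame_loglik(frames, *background_model)
    llrs = [gmm_frame_loglik(frames, *model) - background for model in speaker_models]
    return np.hstack([frames, np.column_stack(llrs)])
```

The extended array keeps the original features in its first `D` columns and adds one log-likelihood-ratio column per pre-trained model, so it can be passed to a segmentation or clustering routine in place of the raw feature vectors.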
