Unsupervised speaker change detection using SVM training misclassification rate

Authors: Po-Chuan Lin, Jia-Ching Wang, Jhing-Fa Wang, Hao-Ching Sung

DOI: 10.1109/TC.2007.70746

Abstract: This work presents an unsupervised speaker change detection algorithm based on support vector machines (SVMs) to detect speaker changes (SCs) in a speech stream. The proposed algorithm is called the SVM training misclassification rate (STMR). STMR can identify SCs with less data collection, making it capable of detecting speaker segments of short duration. According to experiments on the NIST Rich Transcription 2005 Spring Evaluation (RT-05S) corpus, STMR has a missed detection rate of only 19.67 percent.
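The abstract names the idea but gives no procedure, so the following is only a minimal sketch of one plausible reading. It labels the feature frames on either side of a candidate boundary as two classes, trains an SVM on them, and treats a low training misclassification rate as evidence of a speaker change. The linear SVM trained by subgradient descent, the 2-D synthetic features, the threshold of 0.2, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import random


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_linear_svm(X, y, lam=0.01, epochs=300, lr=0.1):
    """Full-batch subgradient descent on the soft-margin hinge loss.

    Assumption: a linear kernel stands in for whatever SVM the paper uses.
    """
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw = [lam * wj for wj in w]  # gradient of the L2 regularizer
        gb = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # margin violated: hinge is active
                for j in range(dim):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b


def training_misclassification_rate(left, right):
    """Train on the two windows and return the error on that same data."""
    X = left + right
    y = [-1] * len(left) + [1] * len(right)
    w, b = train_linear_svm(X, y)
    errors = sum(1 for xi, yi in zip(X, y) if yi * (dot(w, xi) + b) <= 0)
    return errors / len(X)


def is_speaker_change(left, right, threshold=0.2):
    """Hypothetical decision rule: separable windows suggest two speakers."""
    return training_misclassification_rate(left, right) < threshold


def make_window(rng, mean, n):
    """Synthetic stand-in for a window of acoustic feature vectors."""
    return [[rng.gauss(m, 1.0) for m in mean] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    speaker_a = make_window(rng, (0.0, 0.0), 100)
    speaker_a_again = make_window(rng, (0.0, 0.0), 100)
    speaker_b = make_window(rng, (4.0, 4.0), 100)
    print("A vs B:", training_misclassification_rate(speaker_a, speaker_b))
    print("A vs A:", training_misclassification_rate(speaker_a, speaker_a_again))
```

In this toy setting, windows drawn from the same distribution cannot be separated, so the training error stays near chance, while windows from well-separated distributions yield an error close to zero. Real acoustic features would need scaling and a tuned threshold, neither of which the abstract specifies.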
