Voting for two speaker segmentation

Authors: Narayanaswamy Balakrishnan, Richard M. Stern, Rashmi Gangadharaiah

DOI:

Keywords: Speaker diarisation, Change detection, Speech recognition, Computer science, Pattern recognition, Voting, Scale-space segmentation, Speaker recognition, Artificial intelligence, Cluster analysis, Segmentation

Abstract: The process of locating the end points of each speaker's voice in an audio file and then clustering the segments based on speaker identity is called speaker segmentation. In this paper we present a method for two-speaker segmentation, though it can be extended to more than two speakers. Most methods for speaker segmentation start with an initial, computationally inexpensive segmentation method, followed by a more accurate segment clustering. We describe a simple algorithm that improves the accuracy of the initial segmentation while not increasing the computational complexity. Since segment clustering is done iteratively, the improvement in the initial segmentation step results in a significant overall increase in cluster purity. We borrow ideas from speaker recognition to perform frame voting. We look at each frame as an independent classifier deciding which speaker generated the segment. These "classifiers" are combined by voting to make a decision on which segments should be clustered together. This change leads to a 56.9% decrease in error rates on a two-speaker segmentation task on the SWITCHBOARD corpus. Index Terms: speaker segmentation, voting, combination, change detection.
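The frame-voting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each frame of a segment already has a score (for example a log-likelihood) under each of two speaker models, and the names `frame_votes`, `assign_by_voting`, `assign_by_sum`, and the sample scores are hypothetical. Each frame votes for its best-scoring speaker, and the segment goes to the speaker with the most votes. Summing the scores is shown only for contrast.

```python
from collections import Counter

def frame_votes(frame_scores):
    """Count, per speaker, how many frames score highest under that speaker.

    frame_scores: list of per-frame score lists, one score per speaker model
    (assumed here to be log-likelihoods; higher means a better match).
    """
    votes = Counter()
    for scores in frame_scores:
        # Each frame acts as an independent classifier casting one vote.
        winner = max(range(len(scores)), key=lambda k: scores[k])
        votes[winner] += 1
    return votes

def assign_by_voting(frame_scores):
    """Assign the segment to the speaker with the most frame votes.

    Ties go to the lowest speaker index (an arbitrary choice for this sketch).
    """
    votes = frame_votes(frame_scores)
    return max(sorted(votes), key=lambda k: votes[k])

def assign_by_sum(frame_scores):
    """Contrast case: assign by the summed score over all frames."""
    totals = [sum(column) for column in zip(*frame_scores)]
    return max(range(len(totals)), key=lambda k: totals[k])

# Hypothetical scores for a four-frame segment and two speaker models.
# Three frames mildly favor speaker 0; one outlier frame strongly favors speaker 1.
segment = [
    [-1.0, -1.5],
    [-1.2, -1.4],
    [-1.1, -1.6],
    [-30.0, -2.0],
]

print(assign_by_voting(segment))
print(assign_by_sum(segment))
```

In this made-up example the single outlier frame dominates the summed score, while voting limits every frame to one vote. That is one plausible reason to combine per-frame decisions by voting, though the record above does not say how the paper scores frames or breaks ties.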



Similar articles (7)

Unsupervised sequential organization for cochannel speech separation.

Voice Analytics Process

An approach to sequential grouping in cochannel speech

A two pass algorithm for speaker change detection

Graphical models for robust speech recognition in adverse environments

Blind Speaker Clustering Using Phonetic and Spectral Features in Simulated and Realistic Police Interviews

Speech segregation in background noise and competing speech
