Modeling prosodic dynamics for speaker recognition

Authors: A.G. Adami, R. Mihaescu, D.A. Reynolds, J.J. Godfrey

DOI: 10.1109/ICASSP.2003.1202761

Abstract: Most current state-of-the-art automatic speaker recognition systems extract speaker-dependent features by looking at short-term spectral information. This approach ignores long-term information that can convey supra-segmental information, such as prosodics and speaking style. We propose two approaches that use the fundamental frequency and energy trajectories to capture long-term information. The first approach uses bigram models to model the dynamics of the fundamental frequency and energy trajectories for each speaker. The second approach uses the fundamental frequency trajectories of a predefined set of words as the speaker templates and then, using dynamic time warping, computes the distance between the templates and the words from the test message. The results presented in this work are on Switchboard I using the NIST Extended Data evaluation design. We show that these approaches can achieve an equal error rate of 3.7%, which is a 77% relative improvement over a system based on short-term pitch and energy features alone.
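The template-matching step mentioned in the abstract relies on dynamic time warping (DTW). The following is a minimal sketch of textbook DTW on one-dimensional pitch contours, not the authors' implementation: the absolute-difference local cost, the unconstrained step pattern, the function name `dtw_distance`, and the sample contour values are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D sequences (illustrative only)."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local cost: absolute difference between the two samples.
            local = abs(a[i - 1] - b[j - 1])
            # Allow a match, an insertion, or a deletion at each step.
            cost[i][j] = local + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )
    return cost[n][m]


# Hypothetical F0 contours in Hz for one word; the values are made up.
template = [120.0, 125.0, 130.0, 128.0]
same_shape_slower = [120.0, 120.0, 125.0, 130.0, 130.0, 128.0]
different_register = [150.0, 155.0, 160.0, 158.0]

d_same = dtw_distance(template, same_shape_slower)
d_diff = dtw_distance(template, different_register)
```

In this sketch, a test contour that follows the template's shape at a slower rate warps onto it at no cost, while a contour in a different pitch register stays far from it. A practical system would typically also normalize the distance by path length and combine scores across words.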

References (13)

Rosaria Silipo. Automatic transcription of prosodic stress for spontaneous English discourse. (1999).

Katarina Bartkova, Delphine Charlet, David Le Gac, Denis Jouvet. Prosodic parameter for speaker identification. Conference of the International Speech Communication Association (2002).

George R. Doddington. Speaker recognition based on idiolectal differences between speakers. Conference of the International Speech Communication Association, pp. 2521-2524 (2001).

Mitchel Weintraub, Elizabeth Shriberg, Larry P. Heck, M. Kemal Sönmez. Modeling dynamic prosodic variation for speaker verification. Conference of the International Speech Communication Association (1998).

M.J. Carey, E.S. Parris, H. Lloyd-Thomas, S. Bennett. Robust prosodic features for speaker identification. International Conference on Spoken Language Processing, vol. 3, pp. 1800-1803 (1996). DOI: 10.1109/ICSLP.1996.607979

Walter D. Andrews, Mary A. Kohler, Joseph P. Campbell, John J. Godfrey, Jaime Hernandez-Cordero. Gender-dependent phonetic refraction for speaker recognition. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 149-152 (2002). DOI: 10.1109/ICASSP.2002.5743676

B. S. Atal. Automatic speaker recognition based on pitch contours. The Journal of the Acoustical Society of America, vol. 52, pp. 1687-1697 (1972). DOI: 10.1121/1.1913303

John J. Godfrey, Joshua M. Brodsky. Acoustic characteristics of emphasis. The Journal of the Acoustical Society of America, vol. 80 (1986). DOI: 10.1121/1.2023828

D.E. Sturim, D.A. Reynolds, R.B. Dunn, T.F. Quatieri. Speaker verification using text-constrained Gaussian mixture models. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 677-680 (2002). DOI: 10.1109/ICASSP.2002.5743808

Douglas A. Reynolds, Thomas F. Quatieri, Robert B. Dunn. Speaker verification using adapted Gaussian mixture models. Digital Signal Processing, vol. 10, pp. 19-41 (2000). DOI: 10.1006/DSPR.1999.0361
