Modeling prosodic dynamics for speaker recognition

Authors: A.G. Adami, R. Mihaescu, D.A. Reynolds, J.J. Godfrey

DOI: 10.1109/ICASSP.2003.1202761

Abstract: Most current state-of-the-art automatic speaker recognition systems extract speaker-dependent features by looking at short-term spectral information. This approach ignores long-term information that can convey supra-segmental information, such as prosodics and speaking style. We propose two approaches that use the fundamental frequency and energy trajectories to capture long-term information. The first approach uses bigram models to model the dynamics of the fundamental frequency and energy trajectories for each speaker. The second approach uses the fundamental frequency trajectories of a predefined set of words as the speaker templates and then, using dynamic time warping, computes the distance between the templates and the words from the test message. The results presented in this work are on Switchboard I using the NIST Extended Data evaluation design. We show that these approaches can achieve an equal error rate of 3.7%, which is a 77% relative improvement over a system based on short-term pitch and energy features alone.
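
The template-matching step mentioned in the abstract relies on dynamic time warping (DTW). The abstract does not give the local cost, any path constraints, or any length normalization, so the following is only a minimal sketch of textbook DTW between two one-dimensional pitch trajectories. The function name `dtw_distance`, the absolute-difference local cost, and the sample values are assumptions made for illustration, not details taken from the paper.

```python
import math


def dtw_distance(template, test):
    """Textbook DTW between two 1-D sequences (e.g. F0 values per frame).

    Assumption: the local cost is the absolute difference between frames;
    the paper's actual cost and normalization are not stated in the abstract.
    """
    n, m = len(template), len(test)
    if n == 0 or m == 0:
        raise ValueError("both trajectories must be non-empty")

    # cost[i][j] = minimal accumulated cost of aligning
    # template[:i] with test[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(template[i - 1] - test[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance template only
                cost[i][j - 1],      # advance test only
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


# Hypothetical F0 contours (Hz) for the same word: a stored speaker
# template and a slower rendition from a test message.
template_f0 = [110.0, 120.0, 130.0]
test_f0 = [110.0, 120.0, 120.0, 130.0]
print(dtw_distance(template_f0, test_f0))
```

In a template-based system of this kind, a smaller accumulated distance between a speaker's stored word template and the same word in the test message would count as evidence for that speaker. How the per-word distances are combined into a score is not described in the abstract.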

References (13)
Katarina Bartkova, Delphine Charlet, David Le Gac, Denis Jouvet. Prosodic parameter for speaker identification. Conference of the International Speech Communication Association (2002).
George R. Doddington. Speaker recognition based on idiolectal differences between speakers. Conference of the International Speech Communication Association, pp. 2521-2524 (2001).
Mitchel Weintraub, Elizabeth Shriberg, Larry P. Heck, M. Kemal Sönmez. Modeling dynamic prosodic variation for speaker verification. Conference of the International Speech Communication Association (1998).
M.J. Carey, E.S. Parris, H. Lloyd-Thomas, S. Bennett. Robust prosodic features for speaker identification. International Conference on Spoken Language Processing, vol. 3, pp. 1800-1803 (1996). doi:10.1109/ICSLP.1996.607979
Walter D. Andrews, Mary A. Kohler, Joseph P. Campbell, John J. Godfrey, Jaime Hernandez-Cordero. Gender-dependent phonetic refraction for speaker recognition. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 149-152 (2002). doi:10.1109/ICASSP.2002.5743676
B. S. Atal. Automatic Speaker Recognition Based on Pitch Contours. The Journal of the Acoustical Society of America, vol. 52, pp. 1687-1697 (1972). doi:10.1121/1.1913303
John J. Godfrey, Joshua M. Brodsky. Acoustic characteristics of emphasis. The Journal of the Acoustical Society of America, vol. 80 (1986). doi:10.1121/1.2023828
D.E. Sturim, D.A. Reynolds, R.B. Dunn, T.F. Quatieri. Speaker verification using text-constrained Gaussian Mixture Models. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 677-680 (2002). doi:10.1109/ICASSP.2002.5743808
Douglas A. Reynolds, Thomas F. Quatieri, Robert B. Dunn. Speaker Verification Using Adapted Gaussian Mixture Models. Digital Signal Processing, vol. 10, pp. 19-41 (2000). doi:10.1006/DSPR.1999.0361