Detection of shouted speech in noise: Human and machine

Authors: Jouni Pohjalainen, Tuomo Raitio, Santeri Yrttiaho, Paavo Alku

DOI: 10.1121/1.4794394

Abstract: High vocal effort has characteristic acoustic effects on speech. This study focuses on the utilization of this information by human listeners and a machine-based detection system in the task of detecting shouted speech in the presence of noise. Both female and male speakers read Finnish sentences using normal and shouted voice in controlled conditions, with the sound pressure level recorded. The speech material was artificially corrupted by noise and supplemented with pure noise. The human performance level was statistically evaluated by a listening test, where the subjects labeled noisy samples according to whether shouting was heard or not. A Bayesian detection system was constructed and statistically evaluated. Its performance was compared against that of human listeners, substituting different spectrum analysis methods in the feature extraction stage. Using features capable of taking into account the spectral fine structure (i.e., the fundamental frequency and its harmonics), the machine reached the detection level of humans even in the noisiest conditions. In the listening test, male speakers were detected significantly better than female speakers, especially with speakers making a smaller vocal effort increase for shouting.

References (39)
John H. L. Hansen, Chi Zhang, "Analysis and classification of speech mode: Whispered through shouted," Conference of the International Speech Communication Association, pp. 2289-2292 (2007)
Mark Ordowski, Mark A. Przybocki, Alvin F. Martin, George R. Doddington, Terri Kamm, "The DET Curve in Assessment of Detection Task Performance," Conference of the International Speech Communication Association (1997)
Martin Graciarena, Elizabeth Shriberg, Huda Jameel, Colleen Richey, Harry Bratt, Sachin S. Kajarekar, Andreas Kathol, Fred Goodman, "Effects of vocal effort and speaking style on text-independent speaker verification," Conference of the International Speech Communication Association, pp. 609-612 (2008)
Daniel Neiberg, Kjell Elenius, Kornel Laskowski, "Emotion Recognition in Spontaneous Speech Using GMMs," International Conference on Spoken Language Processing, pp. 809-812 (2006)
George D. Allen, "Acoustic Level and Vocal Effort as Cues for the Loudness of Speech," The Journal of the Acoustical Society of America, vol. 49, pp. 1831-1841 (1971), doi: 10.1121/1.1912588
J. Makhoul, "Linear prediction: A tutorial review," Proceedings of the IEEE, vol. 63, pp. 561-580 (1975), doi: 10.1109/PROC.1975.9792
Hartmut Traunmüller, Anders Eriksson, "Acoustic effects of variation in vocal effort by men, women, and children," The Journal of the Acoustical Society of America, vol. 107, pp. 3438-3451 (2000), doi: 10.1121/1.429414
Paavo Alku, Matti Airas, Eva Björkner, Johan Sundberg, "An amplitude quotient based method to analyze changes in the shape of the glottal pulse in the regulation of vocal intensity," The Journal of the Acoustical Society of America, vol. 120, pp. 1052-1062 (2006), doi: 10.1121/1.2211589
Jean‐Claude Junqua, "The Lombard reflex and its role on human listeners and automatic speech recognizers," The Journal of the Acoustical Society of America, vol. 93, pp. 510-524 (1993), doi: 10.1121/1.405631
John F. Brandt, Kenneth F. Ruder, Thomas Shipp, "Vocal loudness and effort in continuous speech," The Journal of the Acoustical Society of America, vol. 46, pp. 1543-1548 (1969), doi: 10.1121/1.1911899