Comparing the information conveyed by envelope modulation for speech intelligibility, speech quality, and music quality

Authors: James M. Kates, Kathryn H. Arehart

DOI: 10.1121/1.4931899

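Abstract: This paper uses mutual information to quantify the relationship between envelope modulation fidelity and perceptual responses. Data from several previous experiments that measured speech intelligibility, speech quality, and music quality are evaluated for normal-hearing and hearing-impaired listeners. A model of the auditory periphery is used to generate envelope signals, and envelope modulation fidelity is calculated using the normalized cross-covariance of the degraded signal envelope with that of a reference signal. Two procedures are used to describe the envelope modulation: (1) modulation within each auditory frequency band and (2) spectro-temporal processing that analyzes the modulation of spectral ripple components fit to successive short-time spectra. The results indicate that low modulation rates provide the highest information for intelligibility, while high modulation rates provide the highest information for quality. The low-to-mid auditory frequencies are most important for intelligibility, while mid frequencies are most important for speech quality and high frequencies are most important for music quality. Differences between the spectral ripple components used for the spectro-temporal analysis were not significant in five of the six experimental conditions evaluated. The results indicate that different modulation-rate and auditory-frequency weights may be appropriate for indices designed to predict different types of perceptual relationships.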
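The two quantities named in the abstract, normalized cross-covariance between a reference and a degraded envelope and mutual information between a fidelity measure and a perceptual response, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names are hypothetical, the cross-covariance is taken at zero lag only, the mutual information uses a simple histogram (plug-in) estimator chosen here for illustration, and the auditory periphery model that produces the envelopes is not reproduced.

```python
import numpy as np


def normalized_cross_covariance(ref_env, deg_env):
    """Zero-lag normalized cross-covariance of two envelope signals.

    Hypothetical helper: returns a value in [-1, 1], or 0.0 when either
    envelope is constant (no modulation to compare).
    """
    ref = np.asarray(ref_env, dtype=float)
    deg = np.asarray(deg_env, dtype=float)
    ref = ref - ref.mean()  # remove the mean so only modulation remains
    deg = deg - deg.mean()
    denom = np.sqrt(np.sum(ref ** 2) * np.sum(deg ** 2))
    if denom == 0.0:
        return 0.0
    return float(np.sum(ref * deg) / denom)


def mutual_information_bits(x, y, bins=8):
    """Histogram (plug-in) estimate of mutual information, in bits.

    Assumption: equal-width bins; the estimator and bin count are
    illustrative choices, not taken from the paper.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()            # joint probability table
    px = pxy.sum(axis=1, keepdims=True)  # marginal of x (column vector)
    py = pxy.sum(axis=0, keepdims=True)  # marginal of y (row vector)
    nz = pxy > 0                         # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))
```

In such a sketch, the fidelity value would be computed per condition (and, under the first procedure in the abstract, per auditory band and modulation rate), and the mutual information would then be estimated between those fidelity values and the corresponding listener scores.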
