Linking Speech Perception and Neurophysiology: Speech Decoding Guided by Cascaded Oscillators Locked to the Input Rhythm

Author: Oded Ghitza

DOI: 10.3389/FPSYG.2011.00130

Keywords: Computer science; Motor theory of speech perception; Word error rate; Intelligibility (communication); Speech recognition; Decoding methods; Speech processing; Speech perception; Word recognition; Neurocomputational speech processing

Abstract: The premise of this study is that current models of speech perception, which are driven by acoustic features alone, are incomplete, and that the role of decoding time during memory access must be incorporated to account for the patterns of observed recognition phenomena. It is postulated that decoding time is governed by a cascade of neuronal oscillators, which guide template-matching operations at a hierarchy of temporal scales. Cascaded cortical oscillations in the theta, beta, and gamma frequency bands are argued to be crucial for speech intelligibility. Intelligibility is high so long as these oscillations remain phase-locked to the auditory input rhythm. A model (Tempo) is presented which is capable of emulating recent psychophysical data on the intelligibility of speech sentences as a function of “packaging” rate (Ghitza and Greenberg, 2009). The data show that the intelligibility of speech that is time-compressed by a factor of 3 (i.e., a high syllabic rate) is poor (above 50% word error rate), but is substantially restored when the information stream is re-packaged by the insertion of silence gaps between successive compressed-signal intervals. This is a counterintuitive finding, difficult to explain using classical models of speech perception, but emerging naturally from the Tempo architecture.

References (47)
O. Ghitza, D. Messing, L. Delhorne, L. Braida, E. Bruckert, M. M. Sondhi, Towards Predicting Consonant Confusions of Degraded Speech. Springer, Berlin, Heidelberg, pp. 541-550 (2007). DOI: 10.1007/978-3-540-73009-5_58
Marcel Bastiaansen, Peter Hagoort, Oscillatory neuronal dynamics during language comprehension. Progress in Brain Research, vol. 159, pp. 179-196 (2006). DOI: 10.1016/S0079-6123(06)59012-0
Paul A. Luce, Conor T. McLennan, Spoken Word Recognition: The Challenge of Variation. Blackwell Publishing Ltd., pp. 591-609 (2005). DOI: 10.1002/9780470757024.CH24
Kenneth N. Stevens, Features in Speech Perception and Lexical Access. The Handbook of Speech Perception, pp. 124-155 (2008). DOI: 10.1002/9780470757024.CH6
Andrew J. Viterbi, Principles of coherent communication (1966).
John P. Donoghue, Jerome N. Sanes, Nicholas G. Hatsopoulos, Gyöngyi Gaál, Neural Discharge and Local Field Potential Oscillations in Primate Motor Cortex During Voluntary Movements. Journal of Neurophysiology, vol. 79, pp. 159-173 (1998). DOI: 10.1152/JN.1998.79.1.159
B. Morillon, K. Lehongre, R. S. J. Frackowiak, A. Ducorps, A. Kleinschmidt, D. Poeppel, A.-L. Giraud, Neurophysiological origin of human brain asymmetry for speech and language. Proceedings of the National Academy of Sciences of the United States of America, vol. 107, pp. 18688-18693 (2010). DOI: 10.1073/PNAS.1007189107
Emmanuel Dupoux, Kerry Green, Perceptual adjustment to highly compressed speech: effects of talker and rate changes. Journal of Experimental Psychology: Human Perception and Performance, vol. 23, pp. 914-927 (1997). DOI: 10.1037//0096-1523.23.3.914
Torsten Dau, Modeling auditory processing of amplitude modulation. Journal of the Acoustical Society of America, vol. 101, pp. 3061-3061 (1997). DOI: 10.1121/1.418727