Experiment with asynchrony in multimodal speech communication

Authors: Jonas Beskow, Björn Granström, Marie Molander

Keywords: Speech communication, Telephone communication, Mathematics, Analysis of variance, Speech recognition, Hearing loss, Intelligibility (communication), Perception, Speech technology, Negative number

Abstract: The purpose of this study was to examine the delay effects in audiovisual speech perception for natural and synthetic faces. The main focus was on the SYNFACE project, the development of a telephone communication aid for hearing impaired persons. In the experiments, the consequence of temporal displacement of the audio in relation to the visual channel was investigated. The audio was presented with a vocoder-like distortion to simulate a hearing loss. Twelve different experimental conditions were presented to the subjects in two separate sessions. One face was tested with audio-leading (negative numbers) as well as audio-lagging (positive numbers) stimuli, whereas the other was tested only with audio-leading stimuli. Asynchronies examined were 50, 175 and 300 ms. In addition, two reference conditions were examined: synchrony and audio-only. Tests with ANOVA including both faces revealed that neither -300 ms nor -175 ms was significantly better than the audio-only condition, which implies that the final product would not be beneficial with delays of this magnitude. The -50 ms condition, however, did not show significantly lower intelligibility scores than the synchronous condition. Unfortunately, the delay measured in the present prototype is greater than this. It would, therefore, be interesting to investigate asynchronies between -50 and -175 ms to see exactly where the intelligibility drops. The ANOVA further showed that the effect of face type was non-significant, indicating that the quality of the synthetic face is close to that of the natural face. The audio-lagging stimuli showed a tolerance for larger delays, verified by a significant decrease in performance as late as at +300 ms (the corresponding [...] ms). Even a gain was found in the +50 ms condition compared to synchrony. However, the gain was not significant, and the statistical analysis showed that asynchronies within the interval [-50, +175] ms have a small effect on the intelligibility of the spoken message.
